Posts

Showing posts from 2010

How to enable clustering in Openfire Enterprise?

What is clustering? A cluster is a group of several servers hosting the same domain. Before Openfire 3.4.0, only one machine could host a domain. Even though a single machine can scale to very large numbers (e.g. more than 100K concurrent users), there is still a limit to how many users it can serve. Moreover, if that machine has a problem and the server stops, all users are affected. Clustering avoids both problems. The load is distributed across several machines, so even if one of them goes down the service as a whole continues to respond. Users who were connected to the failed machine simply reconnect to any of the remaining machines. How do I use clustering in Openfire? Clustering is a commercial feature available in Openfire Enterprise 3.4.0 or later, so you need Openfire Enterprise to use it. If you have an existing Openfire Enterpr

Scraping an Entire Website Using Linux

Web scraping (also called Web harvesting or Web data extraction) is a software technique for extracting information from websites. Usually, such programs simulate human exploration of the Web either by implementing the low-level Hypertext Transfer Protocol (HTTP) directly or by embedding a full-fledged Web browser, such as Internet Explorer (IE) or a Mozilla browser. Web scraping is closely related to Web indexing, which indexes Web content using a bot and is a technique adopted by most search engines. In contrast, Web scraping focuses more on transforming unstructured Web content, typically in HTML format, into structured data that can be stored and analyzed in a central local database or spreadsheet. Web scraping is also related to Web automation, which simulates human Web browsing using computer software. Example uses of Web scraping include online price comparison, weather data monitoring, website change detection, Web research, Web content
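To make the idea of turning HTML into structured data concrete, here is a minimal sketch in Python using only the standard library's `html.parser`. The sample page, the `TableScraper` class, and the `scrape_table` function are all hypothetical names invented for illustration; a real scraper would first fetch the page over HTTP and would need to cope with messier markup.

```python
from html.parser import HTMLParser

# Hypothetical sample page (an assumption for this sketch);
# a real scraper would download this content over HTTP.
SAMPLE_HTML = """
<table>
  <tr><th>Product</th><th>Price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
  <tr><td>Gadget</td><td>24.50</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []          # completed rows, each a list of cell strings
        self._row = None        # row currently being read, if any
        self._in_cell = False   # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        # Text may arrive in several chunks, so append to the current cell.
        if self._in_cell:
            self._row[-1] += data.strip()


def scrape_table(html):
    """Return the table as a list of dicts keyed by the header row."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


records = scrape_table(SAMPLE_HTML)
for record in records:
    print(record)
```

Each resulting dictionary is one structured record, which could then be written to a spreadsheet or database for the kind of analysis described above.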