Need to save a web page or website to view it offline? Are you going to be offline for a long period of time but want to be able to browse your favorite website? If you are using Firefox, then there is one Firefox add-on that might fix your problem.
ScrapBook is a great Firefox extension that helps you save web pages and organize them so they are easy to manage. What I like most about this add-on is that it is lightweight and fast, caches a nearly perfect local copy of a web page, and supports multiple languages. I tested it on several web pages with lots of graphics and fancy CSS styles and was pleasantly surprised to see that the offline version looked exactly the same as the online version.
You can use ScrapBook for the following purposes:
- Save a single web page.
- Save a fragment or part of a web page.
- Save an entire website.
- Organize your collection just like bookmarks, with folders and subfolders.
- Run a full-text search or quickly filter the entire collection.
- Edit a saved web page.
- Edit text and HTML with a feature reminiscent of Opera’s notes.
ScrapBook installation
If you are using the latest version of Firefox, which at the time of this writing is v33 for me, you will have to change some settings so that you can use ScrapBook correctly. By default, the ScrapBook icon is not shown anywhere, so you can only use it when you right-click on a web page. Add a button to a toolbar or menu by right-clicking anywhere on the toolbar and choosing Customize.
In the Customize screen, you will see the ScrapBook icon on the left. Go ahead and drag it to either the toolbar at the top or the menu. Then click the “Exit Customize” button.
Before we move on to using ScrapBook to save a website, you may want to change the add-on’s settings. You can do this by clicking the menu button in the upper right corner (three horizontal lines) and then clicking Add-ons.
Now click on Extensions and then click on the Options button next to the ScrapBook add-on.
Here you can change keyboard shortcuts, storage location, and other minor settings.
Using ScrapBook to download websites
Now let’s take a closer look at actually using the add-on. First, open the website whose pages you want to save. The easiest way to start saving is to right-click anywhere on the page and select “Save Page” or “Save Page As” at the bottom of the menu. These two options are added by ScrapBook.
Save Page lets you select a folder and then automatically saves only the current page. If you need more options, which I usually do, choose Save Page As. Another dialog box will open where you can choose from a variety of options.
The important sections are Options, Download Linked Files, and Detailed Save Options. By default, ScrapBook saves images and styles, but you can add JavaScript if the website needs it to work as expected.
The Download Linked Files section will only download linked images, but you can also download sounds, movie files, archive files, or specify the exact file type to download. This is a really useful option if you are on a website with many links to certain file types (Word documents, PDFs, etc.) and want to quickly download all of the related files.
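To make the idea of downloading only certain file types concrete, here is a minimal Python sketch of filtering a list of links by extension. It is not ScrapBook’s code; the function name `wanted_files` and the example.com URLs are made up for illustration, and I am assuming the file type can be read from the end of the URL path.

```python
from urllib.parse import urlsplit
import posixpath


def wanted_files(urls, extensions):
    """Keep only the URLs whose path ends in one of the given extensions.

    Hypothetical helper for illustration; it only looks at the URL text.
    """
    wanted = {ext.lower().lstrip(".") for ext in extensions}
    result = []
    for url in urls:
        # Use only the path, so a query string like ?download=1 is ignored.
        path = urlsplit(url).path
        ext = posixpath.splitext(path)[1].lower().lstrip(".")
        if ext in wanted:
            result.append(url)
    return result


# Made-up links, as they might appear on a page full of documents.
LINKS = [
    "https://example.com/files/report.pdf",
    "https://example.com/files/notes.DOCX",
    "https://example.com/about.html",
    "https://example.com/files/report.pdf?download=1",
]

documents = wanted_files(LINKS, ["pdf", "docx"])
```

Here `documents` ends up with the two PDF links and the Word document, while the ordinary HTML page is skipped.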
Finally, the Detailed Save Options section is how you download larger portions of a website. By default, the depth is set to 0, which means ScrapBook will not follow links to other pages on the site, or any other links for that matter. If you choose 1, it will download the current page and every page linked from it. A depth of 2 will download the current page, the pages it links to, and any pages linked from those.
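If the depth setting is hard to picture, the following Python sketch shows the general idea of following links only up to a chosen number of hops. It is a simplified model under my own assumptions, not how ScrapBook is implemented: the site is a hypothetical dictionary of page names, and `pages_to_save` is a name I invented for the example.

```python
from collections import deque


def pages_to_save(links, start, depth):
    """Return the pages reachable from `start` within `depth` link hops.

    `links` maps a page name to the list of pages it links to.
    Pages are returned in the order they are discovered.
    """
    seen = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        page, level = queue.popleft()
        if level == depth:
            continue  # do not follow links beyond the chosen depth
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append((target, level + 1))
    return order


# Hypothetical site: index links to a and b, a links to c, c links to d.
SITE = {
    "index": ["a", "b"],
    "a": ["c"],
    "c": ["d"],
}

depth_0 = pages_to_save(SITE, "index", 0)  # only the current page
depth_1 = pages_to_save(SITE, "index", 1)  # plus the pages it links to
depth_2 = pages_to_save(SITE, "index", 2)  # plus the pages those link to
```

Notice how quickly the list grows with each extra level, which is why a high depth on a large site can take a very long time.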
Click the “Save” button and a new window will open and the pages will start downloading. You should hit the Pause button immediately, and here is why. If you just let ScrapBook run, it will start downloading everything referenced by the page, including anything in the source code that links to other sites or ad networks. In my test, besides the main site (labnol.org), it was also downloading ads from googleadservices.com and something from ctrlq.org.
Do you really want ads to appear on the site while you are viewing it offline? Downloading them also wastes time and bandwidth, so it’s best to click Pause and then click the Filter button.
The two best options are Restrict to Domain and Restrict to Directory. They are usually the same, but on some sites they may be different. If you know exactly what pages you want, you can even filter by string and enter your own URL. This option is great because it gets rid of all the other junk and only downloads content from the website you’re on, not from social media sites, ad networks, etc.
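To show the difference between the two restrictions, here is a rough Python sketch of what a domain filter and a directory filter could look like. The exact rules ScrapBook applies are not documented here, so treat this as my assumption of the idea; the function names and the example.com and example.net URLs are invented for the example.

```python
from urllib.parse import urlsplit


def same_domain(url, start_url):
    """True if `url` is on the same host as the page you started from."""
    return urlsplit(url).hostname == urlsplit(start_url).hostname


def same_directory(url, start_url):
    """True if `url` is on the same host and under the starting page's folder."""
    if not same_domain(url, start_url):
        return False
    start_path = urlsplit(start_url).path or "/"
    # Folder of the starting page, always ending in "/".
    base = start_path.rsplit("/", 1)[0] + "/"
    path = urlsplit(url).path or "/"
    return path.startswith(base)


# Hypothetical starting page and a few links found on it.
START = "https://example.com/blog/index.html"

on_site = same_domain("https://example.com/about.html", START)
ad_network = same_domain("https://ads.example.net/banner.js", START)
in_blog = same_directory("https://example.com/blog/post1.html", START)
outside_blog = same_directory("https://example.com/about.html", START)
```

In this made-up case the about page passes the domain check but fails the directory check, which is the kind of site where the two options behave differently.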
Go ahead and click Start and the pages will start downloading. The download time will depend on the speed of your internet connection and how much of the site you are downloading. The add-on works great for most sites, and the only problem I’ve encountered is that some sites use absolute URLs to link to their own content.
The problem with absolute URLs is that when you open the index page in Firefox offline and click any of the links, Firefox will try to load the page from the actual website, not from the local copy. In such cases, you will have to manually open the download directory and open the pages from there. It’s frustrating, though I’ve only run into it on a few sites. You can view the folder where a site was saved by clicking the ScrapBook button in the toolbar and then right-clicking the site and choosing Tools – Show Files.
In Explorer, sort by type, and then scroll down to the files of type HTML Document. Content pages are usually the default_00x files, not the index_00x files.
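If you are comfortable with a little scripting, the absolute URL problem described above can in principle be worked around by rewriting the links in a copy of the saved files. The Python sketch below shows the idea only. ScrapBook does not provide this function, the mapping from URLs to file names is something you would have to build yourself, and the URLs and the file name default_001.html are hypothetical.

```python
import re


def localize_links(html, url_to_file):
    """Replace absolute href values with local file names where known.

    `url_to_file` maps an absolute URL to the saved file's name.
    Links that are not in the mapping are left untouched.
    """
    def swap(match):
        url = match.group(2)
        local = url_to_file.get(url)
        if local is None:
            return match.group(0)  # unknown link, keep it as it was
        return match.group(1) + local + match.group(3)

    # Simplified pattern: only handles double-quoted href attributes.
    return re.sub(r'(href=")([^"]*)(")', swap, html)


# Made-up page fragment and mapping.
PAGE = (
    '<a href="https://example.com/blog/post1.html">Post</a> '
    '<a href="https://other.example.net/">Other</a>'
)
SAVED = {"https://example.com/blog/post1.html": "default_001.html"}

fixed = localize_links(PAGE, SAVED)
```

After this, the first link points at the local file and the second still points at the live site. Always work on a copy, since a simple pattern like this will not handle every way a link can be written.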
If you are not using Firefox and still want to download web pages to your computer, you can also use WinHTTrack, which will automatically download an entire website for offline viewing. However, a full site downloaded with WinHTTrack can take up a lot of space, so make sure you have enough free space on your hard drive.
Both programs are good for downloading entire websites or individual web pages. In practice, downloading an entire website is nearly impossible due to the sheer number of links generated by CMS software such as WordPress. If you have any questions, please leave a comment. Enjoy!