Copy a Website

Learning how to code websites in HTML and CSS can be a long and arduous process, especially if you are teaching yourself from scratch. While books can walk you through the process of HTML coding, some concepts are easier to grasp once you see them in action. Copying a website lets you dissect its code, bit by bit, helping you understand exactly how HTML coding works.

Steps

Windows

  1. Download and install HTTrack. If you want to copy an entire site, or a large number of pages from a site at once, you'll want the help of an automatic site downloader. Saving each page manually would be far too time-consuming, and these utilities automate the entire process.
    • The most popular and powerful website copying program is HTTrack, an open source program available for Windows and Linux. This program can copy an entire site, or even the entire internet if configured (im)properly! You can download HTTrack for free from www.httrack.com.
  2. Set the destination for the copied files. Once you’ve opened HTTrack, you’ll need to set a destination folder for the website files. Make sure to create a dedicated folder for your website copies, or you may have difficulty tracking them down in the future.
    • Give your project a name to help you locate it. HTTrack will create a folder in your destination directory with your project name.
  3. Select "Download web site(s)" from the drop-down menu. This will ensure that HTTrack will download all of the content from the website, including any pictures or other files.
  4. Enter the address you want to copy. You can enter multiple addresses if you'd like to copy several sites into the same project directory. By default, HTTrack will follow every link it finds on the website, as long as the link stays on the same web server.
    • If the website you want to copy requires you to log in, use the “Add URL” button to enter the website address as well as the username and password.
  5. Start copying the website. Once you have entered your URLs, you can begin the copying process. Depending on the size of the website, the download process can take a significant amount of time and bandwidth to complete. HTTrack will show the progress of all the files you are copying to your computer.[1]
  6. Check out your copied website. Once the download is complete, you can open the copied website and browse it directly from your computer. Open any of the HTM or HTML files in a web browser to view the pages as if you were online. You can also open these files in a web page editor to see all of the code that makes them work. The files will be localized by default so that links point to the downloaded files and not to the website. This allows for completely offline viewing.
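
To make the "stays on the same web server" rule from step 4 more concrete, here is a minimal Python sketch of how a site downloader might decide which links to follow. It is an illustration only, not HTTrack's actual code: the class name `LinkCollector`, the function `same_server_links`, and the example.com addresses are all assumptions made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects href and src attribute values from an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Pages and stylesheets use href; images and scripts use src.
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def same_server_links(html, page_url):
    """Return the absolute links in html that share page_url's host.

    Hypothetical helper: a real downloader applies many more rules
    (depth limits, file type filters, robots.txt, and so on).
    """
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    kept = []
    for link in collector.links:
        absolute = urljoin(page_url, link)  # resolve relative links
        if urlparse(absolute).netloc == host and absolute not in kept:
            kept.append(absolute)
    return kept


# Made-up sample page with two same-server links and one external link.
sample = (
    '<a href="/about.html">About</a> '
    '<img src="logo.png"> '
    '<a href="https://other.example.org/">Other</a>'
)
print(same_server_links(sample, "https://www.example.com/index.html"))
# prints ['https://www.example.com/about.html', 'https://www.example.com/logo.png']
```

The link to other.example.org is dropped because its host differs from the page's host, which is roughly why a default copy does not wander off to other websites.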

Mac

  1. Download SiteSucker from the Mac App Store. This is a free program that will allow you to download complete copies of websites. You can also download SiteSucker from the website at ricks-apps.com/osx/sitesucker/index.html.
    • If you download the app from the website, double-click the downloaded DMG file. Drag the SiteSucker application icon to your Applications folder to install it.
  2. Enter the URL of the website that you want to copy. With SiteSucker's default settings, every page on the website will be copied and downloaded to your computer. SiteSucker will follow every link it finds but will only download files from the same web server.
    • Advanced users can adjust the settings for SiteSucker, but if you just want to copy a website you don't need to worry about changing anything. SiteSucker will copy the complete website by default.
    • One setting that you may want to change is the location for the copied website on your computer. Click the Gear button to open the Settings menu. In the "General" section, use the "Destination" menu to select where you want the files to be saved.
  3. Click the "Download" button to begin saving the website. SiteSucker will start downloading all of the content from the website that you entered into the URL field. This can potentially take a very long time, and you can monitor the progress in the bottom part of the SiteSucker window.
  4. Enter your username and password if prompted. If you're trying to download content from a password-protected site, you'll be prompted for your login information. By default, SiteSucker will check your Keychain first to see if your login information is already stored. If it isn't, you'll need to enter this information manually.[2]
  5. View your copied website after it is finished downloading. Once you have finished downloading the site, you can view it offline just as if you were online. SiteSucker will localize the web pages so that they point to the local downloaded files instead of the original online address. This allows you to view the entire site without an internet connection.
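
The "localizing" described in step 5 can be pictured as rewriting each link so that it points to a saved file instead of an online address. The Python sketch below shows one possible way to do this. It is not SiteSucker's actual method: the functions `local_path` and `localize`, the folder layout (one folder per host name), and the example.com addresses are assumptions chosen for illustration.

```python
import posixpath
from urllib.parse import urljoin, urlparse


def local_path(url):
    """Map a URL to a hypothetical path inside the download folder.

    Assumed layout: <host>/<path on the server>. Query strings and
    fragments are ignored in this sketch.
    """
    parts = urlparse(url)
    path = parts.path
    if path == "" or path.endswith("/"):
        path += "index.html"  # directory-style URLs get a default file name
    return posixpath.join(parts.netloc, path.lstrip("/"))


def localize(link, page_url):
    """Rewrite link, found on page_url, as a relative path to the saved copy.

    Links to other hosts are returned unchanged, since those pages
    were not downloaded.
    """
    absolute = urljoin(page_url, link)
    if urlparse(absolute).netloc != urlparse(page_url).netloc:
        return link
    page_dir = posixpath.dirname(local_path(page_url))
    return posixpath.relpath(local_path(absolute), start=page_dir)


print(localize("https://www.example.com/images/logo.png",
               "https://www.example.com/blog/post.html"))
# prints ../images/logo.png
```

Because the rewritten link is relative to the saved page, the copied folder can be moved anywhere on your computer and the pages will still find their images and stylesheets.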

Warnings

  • Many webmasters have tools in place that automatically notify them if their content shows up on another site. Do not assume that content you can access is available for use. Always contact the webmaster or site owner before using anything for your own work.
  • Copying a website and then using it as your own is plagiarism. It can also be considered theft of intellectual property. Never use the content copied from another website as your own, although you can quote small portions of someone else's content if you provide attribution.

Sources and Citations