My boss threw down the gauntlet Monday morning during our weekly meeting. In relaunching one of our decade-old platforms, we couldn't afford to get bashed by the fickle finger of Google, and I needed to take extra care in ensuring all redirects were properly made. With over twenty thousand pages, it was no small task, and I struggled to find a way to automate it. Ultimately, I had to make a compromise between feasibility and correctness, but I'm pretty satisfied with the results. In fact, SiteSucker was able to confirm that, besides images, we managed a 100% conversion for all the existing URLs.

At first, I spent about a day going through the top 1000 URLs according to the Google Analytics tracker. This was horribly redundant, mind-numbing work, but I didn't see a way to export all 5 MILLION URLs that Google Analytics had on record for the last month. I calculated that, at the rate I was going through the top 1000, I'd need almost 7 years to double-check the rest. And, of course, I knew we didn't have that many valid pages on our site.

I'd used SiteSucker a few times in recent months to double-check the health of our site's link structure. It's extremely fast, and the user interface is very lean (making use of Mac OS's Console logging application). What I wondered was how I could execute a web crawl not from a site, but from a saved file. Turns out, it's very easy!

First, set SiteSucker to log the download history (and save that log). Then, enter the original site name in the Web URL input and hit enter. You're off to the races! For ~20k links, it took almost 20 minutes for SiteSucker to grab them all.

Take the finished log output and snip away the unneeded text to the left and right of each URL. Go ahead and take the extra time to wrap them up in a nice anchor tag (this will help in the next step).

Copy this file and rename it to reflect the new site where you want to test your redirects. In your favorite editor, search and replace the old domain with the test (or new) domain. Wrap it in basic html & body tags and change the extension to .html. Now we have a very basic HTML page containing all the links from the old site.

In SiteSucker, go back into the settings and check the Limits. We want to enforce a maximum level of 1 now. This is because we already have all the relevant links in our file – no point in asking SiteSucker to recrawl the entire site for every original link (this would take days).

Finally, drag the HTML file you created (containing the test or new domain name and all the links) onto the Web URL bar and let go. SiteSucker dutifully follows every link and reports its findings. Hopefully, you won't have too many ERRORs, but if you do, it's quite easy to rectify them now that you have a log showing exactly which redirects failed!

Don't let huge numbers of links frighten you into thinking you can't make your site better. Thinking a bit outside the box can make it easy to turn that mountain back into a molehill. There are plenty of web crawlers available on many platforms – I have also used and recommend Xenu on Wine.

How do you maintain your sites' link "healthiness"? Let us know in the comments.
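The log-trimming and anchor-wrapping step doesn't have to be done by hand in an editor. Here's a minimal Python sketch of the idea; it assumes each log line simply contains a URL somewhere in it, which may not match the exact SiteSucker log format you get:

```python
import re

# Pull every http(s) URL out of a crawl log and wrap each one
# in an anchor tag, ready to paste into the test page.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")

def log_to_anchors(log_text: str) -> list[str]:
    """Return one <a> tag per unique URL found in the log, in order."""
    seen = []
    for url in URL_RE.findall(log_text):
        if url not in seen:  # keep first occurrence, drop duplicates
            seen.append(url)
    return [f'<a href="{url}">{url}</a>' for url in seen]

# Hypothetical log excerpt, just to show the shape of the output:
sample_log = (
    "10:01 OK http://example.com/about\n"
    "10:01 OK http://example.com/contact\n"
    "10:02 OK http://example.com/about\n"
)
for anchor in log_to_anchors(sample_log):
    print(anchor)
```

Even on ~20k lines this runs in well under a second, and the dedup pass keeps SiteSucker from hitting the same redirect twice.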
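The domain swap and html/body wrapping can likewise be scripted. A sketch, using `old.example.com` and `test.example.com` as stand-in domains (substitute your real old and test hostnames):

```python
def build_test_page(anchors: list[str], old_domain: str, new_domain: str) -> str:
    """Swap the old domain for the test domain in each anchor tag
    (both the href and the visible link text) and wrap the result
    in a minimal HTML document SiteSucker can crawl at level 1."""
    body = "\n".join(a.replace(old_domain, new_domain) for a in anchors)
    return f"<html><body>\n{body}\n</body></html>\n"

# Example: one link from the old site, rewritten to point at the test site.
page = build_test_page(
    ['<a href="http://old.example.com/about">http://old.example.com/about</a>'],
    "old.example.com",
    "test.example.com",
)
print(page)
```

Write the returned string to a `.html` file and that's the page you drag onto SiteSucker's Web URL bar.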