Case study on creating an information site


Migrating an automotive information site from Netcat to MODX Revo.

Region: Vladimir

Start of cooperation: 2015

I took on the website of About Cars, a glossy car magazine from Vladimir, in 2015. It was my first time working with a news portal, since all my other clients' sites are commercial.

The goal was to increase site traffic. This can be done by publishing interesting material, running promotions and contests, and cooperating with other projects.


First of all, a site needs regular, full-length, original news. About Cars had about 650 articles on the site. For a regional niche site, that is enough to attract traffic. But visitors came mainly from social networks (sometimes more than 300 clicks per day) and from links on other sites. Why were there no visits from search engines, when in theory there could have been hundreds per day?

There were many reasons. The first is the Netcat content management system. I confess, the deeper I dug, the more depressed I got. It starts with the fact that it does not accept double quotes (") in code; you have to use single quotes (') instead. The logic of this CMS is frightening: you have to be a detective to find the section where a given piece of code can be edited. I especially liked where robots.txt was hidden.

But the main problem with this CMS is duplicates. There were practically an infinite number of them on the site. Through tags alone, each news item was duplicated several times. There were no redirects, no canonical tags, and no Disallow lines in robots.txt. The site had 667 pages in total, yet Xenu's Link Sleuth found more than 15,000 before I got tired of waiting (the scan was only 30-40% complete) and closed the program. In Yandex.Webmaster, 3,134 pages were listed as "Loaded by robot" and only 153 as "Pages in search". The cherry on the cake was Yandex's AGS filter for low-quality sites.

By the way, duplicates existed not only within one domain but also across the site's mirrors, which had no redirects. There were 6 such mirrors: domains with and without www in the .com, .ru and .рф zones.
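The usual fix for mirrors is a permanent (301) redirect from every mirror to one primary host. Below is a minimal Python sketch of the mapping such a redirect has to implement. The hostnames are hypothetical (the real domains are not named here), and `redirect_target` is an illustrative helper, not something taken from Netcat or MODX.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Hypothetical hostnames: the real domains are not named in this case study.
PRIMARY_HOST = "example.ru"
MIRROR_HOSTS = {
    "www.example.ru",
    "example.com",
    "www.example.com",
    "xn--e1afmkfd.xn--p1ai",      # punycode form of a domain in the .рф zone
    "www.xn--e1afmkfd.xn--p1ai",
}


def redirect_target(url: str) -> Optional[str]:
    """Return the URL a 301 redirect should point to.

    Returns None when the URL is not on a known mirror,
    i.e. no redirect is needed.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in MIRROR_HOSTS:
        return None
    # Keep scheme, path and query; swap only the host and drop the fragment.
    return urlunsplit((parts.scheme, PRIMARY_HOST, parts.path, parts.query, ""))


print(redirect_target("http://www.example.com/news/42?page=2"))
print(redirect_target("http://example.ru/news/42"))
```

In practice this logic normally lives in the web server configuration rather than in application code; the sketch only shows which URLs should collapse into which.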

So, work on the site began. First, we decided which blocks were not needed: the forum, blogs, reviews, site registration, and an entire module for buying and selling cars. We changed the appearance, and the site became easier to understand. We also increased the width of the site and of the main block for news and articles.

Updating the website of the automobile magazine About Cars

The load time of the main page dropped from 9 to 4.5 seconds. The weight of the main page decreased from 2.99 megabytes to 824 kilobytes, and the number of browser requests fell from 90 to 32.

Then I set up robots.txt. As you can see, it is quite big now. All unwanted pages and duplicates were closed to indexing.

Indexing the website of the automobile magazine About Cars
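A set of Disallow rules can be sanity-checked before deployment with Python's standard `urllib.robotparser`. The rules and URLs below are hypothetical, since the real file is not reproduced here. Note that this parser does simple prefix matching and does not understand the wildcard extensions some search engines support, so the sketch sticks to plain path prefixes.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: the real robots.txt of the site is not shown in the text.
ROBOTS_TXT = """\
User-agent: *
Disallow: /tags/
Disallow: /search/
Disallow: /forum/
Sitemap: http://example.ru/sitemap.xml
"""

robots = RobotFileParser()
robots.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url: str) -> bool:
    """True if a generic crawler ("*") may fetch the URL under these rules."""
    return robots.can_fetch("*", url)


# Article pages should stay open, tag duplicates should be closed.
for url in ("http://example.ru/news/42", "http://example.ru/tags/bmw/"):
    print(url, "->", "allowed" if is_crawlable(url) else "blocked")
```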

With robots.txt and sitemap.xml in place, the code errors eliminated, and loading sped up, Yandex started indexing pages normally.

Visits from search engines

The big trouble was the dynamics of the link profile. The graph shows that about 5,000 links had been purchased since February. Moreover, they turned out to be rented links bought without any filtering, and the most popular anchor was "Auto News". This is a direct route to a search engine ban. The links were removed, but search engines need several months to recover from such a link explosion.

Link explosion

Continuing to fix technical errors, I realized that it would be almost impossible to do this properly on this CMS. We decided to move the site to MODX Revo. But the site had hundreds of news items and articles, so copying and pasting them by hand was impractical. I wrote a parser that saved the necessary site data in a convenient CSV format. HTML page titles, headings, illustrations, and article texts were preserved. Then everything was imported into MODX through the importX plugin. There were no particular difficulties.
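The parser itself is not shown here, so the following is only a minimal sketch of the idea in Python, using the standard library. It assumes, hypothetically, that each article page keeps its text in `<p>` tags and its heading in `<h1>`. The CSV column names are illustrative and would still need to be mapped to whatever fields the import tool expects.

```python
import csv
import io
from html.parser import HTMLParser

FIELDS = ["url", "title", "heading", "images", "content"]  # illustrative names


class ArticleParser(HTMLParser):
    """Collect <title> text, <h1> text, image URLs and paragraph text."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.heading = ""
        self.images = []
        self.paragraphs = []
        self._current = None  # tag whose text is currently being collected

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1", "p"):
            self._current = tag
            if tag == "p":
                self.paragraphs.append("")
        elif tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.images.append(src)

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "h1":
            self.heading += data
        elif self._current == "p":
            self.paragraphs[-1] += data


def article_to_row(url, html):
    """Turn one HTML page into a dict with one value per CSV column."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()
    text = "\n".join(p.strip() for p in parser.paragraphs if p.strip())
    return {
        "url": url,
        "title": parser.title.strip(),
        "heading": parser.heading.strip(),
        "images": "|".join(parser.images),
        "content": text,
    }


def rows_to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample page, standing in for a page fetched from the old site.
SAMPLE_HTML = """<html><head><title>Test drive | About Cars</title></head>
<body><h1>Test drive</h1>
<p>First paragraph with a <a href="/news/1">link</a>.</p>
<p><img src="/files/car.jpg" alt="Car"></p>
</body></html>"""

row = article_to_row("/news/42", SAMPLE_HTML)
print(rows_to_csv([row]))
```

Storing image paths in a single column separated by `|` is just one possible convention; on a real site the extraction would also need to be limited to the article container so that menus and sidebars are not picked up.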

During parsing, many links to dubious external sites turned up, which was one more reason for search engines to ban the site. All such links were automatically closed with the nofollow attribute, and some were simply deleted. It also turned out that a couple of hundred news items had no illustrations (they were already missing on the old engine). I also found many links to non-existent documents, which is a bad sign. I fixed everything.
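Closing external links automatically can be sketched as follows. This is an assumption about how such a step might look, not the code actually used: `SITE_HOST` is a hypothetical hostname, and the regular expressions are a simplification that works on reasonably clean markup.

```python
import re
from urllib.parse import urlsplit

SITE_HOST = "example.ru"  # hypothetical: the site's own hostname

A_TAG = re.compile(r"<a\b[^>]*>", re.IGNORECASE)
HREF = re.compile(r"""href\s*=\s*["']([^"']+)["']""", re.IGNORECASE)
REL = re.compile(r"\brel\s*=", re.IGNORECASE)


def is_external(href: str) -> bool:
    """True if the link points to a host other than the site itself."""
    host = urlsplit(href).hostname
    return host is not None and host.lower() not in (SITE_HOST, "www." + SITE_HOST)


def add_nofollow(html: str) -> str:
    """Add rel="nofollow" to external <a> tags that have no rel attribute."""

    def fix(match):
        tag = match.group(0)
        href = HREF.search(tag)
        # Leave internal links, links without href, and tags with rel as-is.
        if not href or not is_external(href.group(1)) or REL.search(tag):
            return tag
        return tag[:-1] + ' rel="nofollow">'

    return A_TAG.sub(fix, html)


sample = '<a href="http://other.example.com/">ad</a> <a href="/news/1">news</a>'
print(add_nofollow(sample))
```

Relative links and links to the site's own host are left untouched, so internal navigation keeps passing link weight as before.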

Parsing articles from the old site

Let's work together!