List of search bots. Exabot ng browser

Protecting a site from unwanted bots, spam bots, and scrapers

Large sites with thousands of pages and online stores with extensive catalogs often run into sudden spikes in server load. Very often the cause is not a DDoS attack, a virus, or a hacker, but ordinary robots of little-known search engines or assorted services. They send so many requests per unit of time that the load rises and the hosting plan's limits are exceeded.

Note that this problem mainly affects large online stores: if your site has 100-500 pages or fewer, even an average shared hosting plan will cope with such a spike without much trouble. VDS servers withstand far higher loads, so for stores hosted on a VDS the problem is usually noticeable only during the New Year rush or just before holidays, when servers are running at the limit of their capacity.

Sometimes the cause of a sudden load spike can be found only by analyzing the server logs, but occasionally Yandex Metrica is enough, because it sometimes counts bots as regular visitors.

Signs of a bot on the site
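One sign that is easy to check in the logs is a single User-Agent accounting for a disproportionate share of hits. The sketch below counts hits per User-Agent. It assumes the server writes the common "combined" access log format; the sample lines, IP addresses, and the names `count_user_agents` and `SAMPLE_LOG` are made up for illustration and would be replaced by lines read from your own log file.

```python
import re
from collections import Counter

# Tail of a combined-format log line:
#   "request" status size "referer" "user-agent"
LOG_TAIL = re.compile(r'"[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"\s*$')


def count_user_agents(lines):
    """Return a Counter mapping each User-Agent string to its number of hits."""
    hits = Counter()
    for line in lines:
        match = LOG_TAIL.search(line)
        if match:  # lines that do not look like combined-format entries are skipped
            hits[match.group("agent")] += 1
    return hits


# Made-up sample data (documentation IP ranges, illustrative timestamps).
SAMPLE_LOG = [
    '203.0.113.5 - - [01/Jan/2020:10:00:01 +0000] "GET /catalog?page=2 HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)"',
    '203.0.113.5 - - [01/Jan/2020:10:00:02 +0000] "GET /catalog?page=3 HTTP/1.1" 200 498 "-" '
    '"Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)"',
    '198.51.100.7 - - [01/Jan/2020:10:00:03 +0000] "GET / HTTP/1.1" 200 1024 "-" '
    '"Mozilla/5.0 (Windows NT 10.0; rv:115.0) Gecko/20100101 Firefox/115.0"',
    'this line is not a log entry',
]

# Print the busiest user agents first.
for agent, count in count_user_agents(SAMPLE_LOG).most_common(5):
    print(count, agent)
```

On a real server the list would come from the access log itself, for example `count_user_agents(open(path_to_log, encoding="utf-8", errors="replace"))`, where the path depends on the hosting setup.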
Examples of unwanted bots

Unwanted bots are often not spam bots or site scrapers at all. Very often they belong to various services or little-known search engines. They pose no direct threat, but because of misconfiguration, an internal error, or some other reason they can put a heavy load on a site through a large number of hits per unit of time.

The MJ12bot bot

The crawler of the Majestic service, which collects data about outgoing links on sites. The robot handles canonical pages normally, but on sites without canonical URLs it gets badly bogged down on pages whose URLs contain parameters.

The BLEXBot bot

The BLEXBot Crawler is described as a search engine robot; the official page does not say which search engine.

The AhrefsBot bot

Like MJ12bot, this robot analyzes a site's pages for external links. The site ahrefs.com itself offers a decent professional service for evaluating and analyzing link profiles.

The HubSpot Webcrawler bot

This robot belongs to HubSpot (hubspot.com), a marketing platform rather than a search engine. HubSpot describes its platform as built for companies that need to attract visitors, so in effect the robot gathers content from online stores.
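The remark about MJ12bot getting stuck on parameterized URLs can be checked against your own traffic. The following is a minimal sketch, assuming you have already extracted (User-Agent, URL) pairs from the access log. The function name `parameter_share`, the `WATCHED_BOTS` tuple, and the sample requests are hypothetical and serve only as an illustration.

```python
from urllib.parse import urlsplit

# Bots named in this section; matching is a case-insensitive substring test.
WATCHED_BOTS = ("MJ12bot", "BLEXBot", "AhrefsBot", "HubSpot")


def parameter_share(requests):
    """Return {bot: (total_hits, hits_on_urls_with_query_parameters)}.

    `requests` is an iterable of (user_agent, url) pairs.
    """
    stats = {}
    for agent, url in requests:
        lowered = agent.lower()
        for bot in WATCHED_BOTS:
            if bot.lower() in lowered:
                total, with_query = stats.get(bot, (0, 0))
                has_query = bool(urlsplit(url).query)  # non-empty "?a=b" part
                stats[bot] = (total + 1, with_query + int(has_query))
                break  # count each request for one bot only
    return stats


# Made-up sample requests for illustration.
SAMPLE_REQUESTS = [
    ("Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)", "/catalog?sort=price&page=3"),
    ("Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)", "/catalog?sort=name"),
    ("Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)", "/about/"),
    ("Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)", "/"),
    ("Mozilla/5.0 (Windows NT 10.0; rv:115.0) Gecko/20100101 Firefox/115.0", "/catalog?sort=price"),
]

for bot, (total, with_query) in parameter_share(SAMPLE_REQUESTS).items():
    print(f"{bot}: {with_query} of {total} requests hit URLs with parameters")
```

A bot whose requests land almost entirely on parameterized URLs is a candidate for the restrictions described in the following sections; adding canonical URLs to the site is the other side of the same fix.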
Other unwanted bots

I have not personally encountered the bots listed below, but they may well still be active:

Aboundex, 80legs, 360Spider, Java, Cogentbot, Alexibot, asterias, attach, BackDoorBot, BackWeb, Bandit, BatchFTP, Bigfoot, Black.Hole, BlackWidow, BlowFish, BotALot, Buddy, BuiltBotTough, Bullseye, BunnySlippers, Cegbfeieh, CheeseBot, CherryPicker, ChinaClaw, Collector, Copier, CopyRightCheck, cosmos, Crescent, Custo, AIBOT, DISCo, DIIbot, DittoSpyder, Download Demon, Download Devil, Download Wonder, dragonfly, Drip, eCatch, EasyDL, ebingbong, EirGrabber, EmailCollector, EmailSiphon, EmailWolf, EroCrawler, Exabot, Express WebPictures, Extractor, EyeNetIE, Foobot, flunky, FrontPage, Go-Ahead-Got-It, gotit, GrabNet, Grafula, Harvest, hloader, HMView, HTTrack, humanlinks, IlseBot, Image Stripper, Image Sucker, Indy Library, InfoNavibot, InfoTekies, Intelliseek, InterGET, Internet Ninja, Iria, Jakarta, JennyBot, JetCar, JOC, JustView, Jyxobot, Kenjin.Spider, Keyword.Density, larbin, LexiBot, lftp, libWeb/clsHTTP, likse, LinkextractorPro, LinkScan/8.1a.Unix, LNSpiderguy, LinkWalker, lwp-trivial, LWP::Simple, Magnet, Mag-Net, MarkWatch, Mass Downloader, Mata.Hari, Memo, Microsoft.URL, Microsoft URL Control, MIDown tool, MIIxpc, Mirror, Missigua Locator, Mister PiX, moget, Mozilla/3.Mozilla/2.01, Mozilla.*NEWT, NAMEPROTECT, Navroad, NearSite, NetAnts, Netcraft, NetMechanic, NetSpider, Net Vampire, NetZIP, NextGenSearchBot, NG, NICErsPRO, niki-bot, NimbleCrawler, Ninja, NPbot, Octopus, Offline Explorer, Offline Navigator, Openfind, OutfoxBot, PageGrabber, Papa Foto, pavuk, pcBrowser, PHP version tracker, Pockey, ProPowerBot/2.14, ProWebWalker, psbot, Pump, QueryN.Metasearch, RealDownload, Reaper, Recorder, ReGet, RepoMonkey, RMA, Siphon, SiteSnagger, SlySearch, SmartDownload, Snake, Snapbot, Snoopy, sogou, SpaceBison, SpankBot, spanner, Sqworm, Stripper, Sucker, SuperBot, SuperHTTP, Surfbot, suzuran, Szukacz/1.4, tAkeOut, Teleport, Telesoft, TurnitinBot/1.5, The.Intraformant, TheNomad, TightTwatBot, Titan, True_bot, turingos, TurnitinBot, URLy.Warning, Vacuum, VCI, VoidEYE, Web Image Collector, Web Sucker, WebAuto, WebBandit, Webclipping.com, WebCopier, WebEMailExtrac.*, WebEnhancer, WebFetch, WebGo IS, Web.Image.Collector, WebLeacher, WebmasterWorldForumBot, WebReaper, WebSauger, Website eXtractor, Website Quester, Webster, WebStripper, WebWhacker, WebZIP, Whacker, Widow, WISENutbot, WWWOFFLE, WWW-Collector-E, Xaldon, Xenu, Zeus, ZmEu, Zyborg, AhrefsBot, archive.org_bot, bingbot, Wget, Acunetix, FHscan

Limiting bot activity with robots.txt

The universal solution is to limit the request rate with the Crawl-delay directive in the robots.txt file. The numeric value sets the pause, in seconds, between requests to the site. A typical example suitable for most sites:

User-agent: *
Crawl-delay: 10

Ten seconds is more than enough to cap the load that search engine robots put on the site. However, some unwanted bots ignore this directive, and even an outright ban in robots.txt does not save the site from high load:

User-agent: MJ12bot
Disallow: /

In such cases the only remaining option is to block access to the site, either by the IP addresses the requests come from or by User-agent. The second option is preferable, because blocking by IP makes the site unavailable to every device behind that address, including ordinary users.

Blocking bots by User-agent through the .htaccess file

A large share of sites run on Linux hosting where Apache acts as the web server. The web server processes user requests and serves the site's pages. To block access by User-agent, add an .htaccess file to the site root (if it is not there already) and append the following lines.
# Flag requests from unwanted user agents with the "bot" environment variable
SetEnvIfNoCase User-Agent "Aboundex" bot
SetEnvIfNoCase User-Agent "80legs" bot
SetEnvIfNoCase User-Agent "360Spider" bot
SetEnvIfNoCase User-Agent "^Java" bot
SetEnvIfNoCase User-Agent "^Cogentbot" bot
SetEnvIfNoCase User-Agent "^Alexibot" bot
SetEnvIfNoCase User-Agent "^asterias" bot
SetEnvIfNoCase User-Agent "^attach" bot
SetEnvIfNoCase User-Agent "^BackDoorBot" bot
SetEnvIfNoCase User-Agent "^BackWeb" bot
SetEnvIfNoCase User-Agent "Bandit" bot
SetEnvIfNoCase User-Agent "^BatchFTP" bot
SetEnvIfNoCase User-Agent "^Bigfoot" bot
SetEnvIfNoCase User-Agent "^Black.Hole" bot
SetEnvIfNoCase User-Agent "^BlackWidow" bot
SetEnvIfNoCase User-Agent "^BlowFish" bot
SetEnvIfNoCase User-Agent "^BotALot" bot
SetEnvIfNoCase User-Agent "Buddy" bot
SetEnvIfNoCase User-Agent "^BuiltBotTough" bot
SetEnvIfNoCase User-Agent "^Bullseye" bot
SetEnvIfNoCase User-Agent "^BunnySlippers" bot
SetEnvIfNoCase User-Agent "^Cegbfeieh" bot
SetEnvIfNoCase User-Agent "^CheeseBot" bot
SetEnvIfNoCase User-Agent "^CherryPicker" bot
SetEnvIfNoCase User-Agent "^ChinaClaw" bot
SetEnvIfNoCase User-Agent "Collector" bot
SetEnvIfNoCase User-Agent "Copier" bot
SetEnvIfNoCase User-Agent "^CopyRightCheck" bot
SetEnvIfNoCase User-Agent "^cosmos" bot
SetEnvIfNoCase User-Agent "^Crescent" bot
SetEnvIfNoCase User-Agent "^Custo" bot
SetEnvIfNoCase User-Agent "^AIBOT" bot
SetEnvIfNoCase User-Agent "^DISCo" bot
SetEnvIfNoCase User-Agent "^DIIbot" bot
SetEnvIfNoCase User-Agent "^DittoSpyder" bot
SetEnvIfNoCase User-Agent "^Download\ Demon" bot
SetEnvIfNoCase User-Agent "^Download\ Devil" bot
SetEnvIfNoCase User-Agent "^Download\ Wonder" bot
SetEnvIfNoCase User-Agent "^dragonfly" bot
SetEnvIfNoCase User-Agent "^Drip" bot
SetEnvIfNoCase User-Agent "^eCatch" bot
SetEnvIfNoCase User-Agent "^EasyDL" bot
SetEnvIfNoCase User-Agent "^ebingbong" bot
SetEnvIfNoCase User-Agent "^EirGrabber" bot
SetEnvIfNoCase User-Agent "^EmailCollector" bot
SetEnvIfNoCase User-Agent "^EmailSiphon" bot
SetEnvIfNoCase User-Agent "^EmailWolf" bot
SetEnvIfNoCase User-Agent "^EroCrawler" bot
SetEnvIfNoCase User-Agent "^Exabot" bot
SetEnvIfNoCase User-Agent "^Express\ WebPictures" bot
SetEnvIfNoCase User-Agent "Extractor" bot
SetEnvIfNoCase User-Agent "^EyeNetIE" bot
SetEnvIfNoCase User-Agent "^Foobot" bot
SetEnvIfNoCase User-Agent "^flunky" bot
SetEnvIfNoCase User-Agent "^FrontPage" bot
SetEnvIfNoCase User-Agent "^Go-Ahead-Got-It" bot
SetEnvIfNoCase User-Agent "^gotit" bot
SetEnvIfNoCase User-Agent "^GrabNet" bot
SetEnvIfNoCase User-Agent "^Grafula" bot
SetEnvIfNoCase User-Agent "^Harvest" bot
SetEnvIfNoCase User-Agent "^hloader" bot
SetEnvIfNoCase User-Agent "^HMView" bot
SetEnvIfNoCase User-Agent "^HTTrack" bot
SetEnvIfNoCase User-Agent "^humanlinks" bot
SetEnvIfNoCase User-Agent "^IlseBot" bot
SetEnvIfNoCase User-Agent "^Image\ Stripper" bot
SetEnvIfNoCase User-Agent "^Image\ Sucker" bot
SetEnvIfNoCase User-Agent "Indy\ Library" bot
SetEnvIfNoCase User-Agent "^InfoNavibot" bot
SetEnvIfNoCase User-Agent "^InfoTekies" bot
SetEnvIfNoCase User-Agent "^Intelliseek" bot
SetEnvIfNoCase User-Agent "^InterGET" bot
SetEnvIfNoCase User-Agent "^Internet\ Ninja" bot
SetEnvIfNoCase User-Agent "^Iria" bot
SetEnvIfNoCase User-Agent "^Jakarta" bot
SetEnvIfNoCase User-Agent "^JennyBot" bot
SetEnvIfNoCase User-Agent "^JetCar" bot
SetEnvIfNoCase User-Agent "^JOC" bot
SetEnvIfNoCase User-Agent "^JustView" bot
SetEnvIfNoCase User-Agent "^Jyxobot" bot
SetEnvIfNoCase User-Agent "^Kenjin.Spider" bot
SetEnvIfNoCase User-Agent "^Keyword.Density" bot
SetEnvIfNoCase User-Agent "^larbin" bot
SetEnvIfNoCase User-Agent "^LexiBot" bot
SetEnvIfNoCase User-Agent "^lftp" bot
SetEnvIfNoCase User-Agent "^libWeb/clsHTTP" bot
SetEnvIfNoCase User-Agent "^likse" bot
SetEnvIfNoCase User-Agent "^LinkextractorPro" bot
SetEnvIfNoCase User-Agent "^LinkScan/8.1a.Unix" bot
SetEnvIfNoCase User-Agent "^LNSpiderguy" bot
SetEnvIfNoCase User-Agent "^LinkWalker" bot
SetEnvIfNoCase User-Agent "^lwp-trivial" bot
SetEnvIfNoCase User-Agent "^LWP::Simple" bot
SetEnvIfNoCase User-Agent "^Magnet" bot
SetEnvIfNoCase User-Agent "^Mag-Net" bot
SetEnvIfNoCase User-Agent "^MarkWatch" bot
SetEnvIfNoCase User-Agent "^Mass\ Downloader" bot
SetEnvIfNoCase User-Agent "^Mata.Hari" bot
SetEnvIfNoCase User-Agent "^Memo" bot
SetEnvIfNoCase User-Agent "^Microsoft.URL" bot
SetEnvIfNoCase User-Agent "^Microsoft\ URL\ Control" bot
SetEnvIfNoCase User-Agent "^MIDown\ tool" bot
SetEnvIfNoCase User-Agent "^MIIxpc" bot
SetEnvIfNoCase User-Agent "^Mirror" bot
SetEnvIfNoCase User-Agent "^Missigua\ Locator" bot
SetEnvIfNoCase User-Agent "^Mister\ PiX" bot
SetEnvIfNoCase User-Agent "^moget" bot
SetEnvIfNoCase User-Agent "^Mozilla/3.Mozilla/2.01" bot
SetEnvIfNoCase User-Agent "^Mozilla.*NEWT" bot
SetEnvIfNoCase User-Agent "^NAMEPROTECT" bot
SetEnvIfNoCase User-Agent "^Navroad" bot
SetEnvIfNoCase User-Agent "^NearSite" bot
SetEnvIfNoCase User-Agent "^NetAnts" bot
SetEnvIfNoCase User-Agent "^Netcraft" bot
SetEnvIfNoCase User-Agent "^NetMechanic" bot
SetEnvIfNoCase User-Agent "^NetSpider" bot
SetEnvIfNoCase User-Agent "^Net\ Vampire" bot
SetEnvIfNoCase User-Agent "^NetZIP" bot
SetEnvIfNoCase User-Agent "^NextGenSearchBot" bot
SetEnvIfNoCase User-Agent "^NG" bot
SetEnvIfNoCase User-Agent "^NICErsPRO" bot
SetEnvIfNoCase User-Agent "^niki-bot" bot
SetEnvIfNoCase User-Agent "^NimbleCrawler" bot
SetEnvIfNoCase User-Agent "^Ninja" bot
SetEnvIfNoCase User-Agent "^NPbot" bot
SetEnvIfNoCase User-Agent "^Octopus" bot
SetEnvIfNoCase User-Agent "^Offline\ Explorer" bot
SetEnvIfNoCase User-Agent "^Offline\ Navigator" bot
SetEnvIfNoCase User-Agent "^Openfind" bot
SetEnvIfNoCase User-Agent "^OutfoxBot" bot
SetEnvIfNoCase User-Agent "^PageGrabber" bot
SetEnvIfNoCase User-Agent "^Papa\ Foto" bot
SetEnvIfNoCase User-Agent "^pavuk" bot
SetEnvIfNoCase User-Agent "^pcBrowser" bot
SetEnvIfNoCase User-Agent "^PHP\ version\ tracker" bot
SetEnvIfNoCase User-Agent "^Pockey" bot
SetEnvIfNoCase User-Agent "^ProPowerBot/2.14" bot
SetEnvIfNoCase User-Agent "^ProWebWalker" bot
SetEnvIfNoCase User-Agent "^psbot" bot
SetEnvIfNoCase User-Agent "^Pump" bot
SetEnvIfNoCase User-Agent "^QueryN.Metasearch" bot
SetEnvIfNoCase User-Agent "^RealDownload" bot
SetEnvIfNoCase User-Agent "Reaper" bot
SetEnvIfNoCase User-Agent "Recorder" bot
SetEnvIfNoCase User-Agent "^ReGet" bot
SetEnvIfNoCase User-Agent "^RepoMonkey" bot
SetEnvIfNoCase User-Agent "^RMA" bot
SetEnvIfNoCase User-Agent "Siphon" bot
SetEnvIfNoCase User-Agent "^SiteSnagger" bot
SetEnvIfNoCase User-Agent "^SlySearch" bot
SetEnvIfNoCase User-Agent "^SmartDownload" bot
SetEnvIfNoCase User-Agent "^Snake" bot
SetEnvIfNoCase User-Agent "^Snapbot" bot
SetEnvIfNoCase User-Agent "^Snoopy" bot
SetEnvIfNoCase User-Agent "^sogou" bot
SetEnvIfNoCase User-Agent "^SpaceBison" bot
SetEnvIfNoCase User-Agent "^SpankBot" bot
SetEnvIfNoCase User-Agent "^spanner" bot
SetEnvIfNoCase User-Agent "^Sqworm" bot
SetEnvIfNoCase User-Agent "Stripper" bot
SetEnvIfNoCase User-Agent "Sucker" bot
SetEnvIfNoCase User-Agent "^SuperBot" bot
SetEnvIfNoCase User-Agent "^SuperHTTP" bot
SetEnvIfNoCase User-Agent "^Surfbot" bot
SetEnvIfNoCase User-Agent "^suzuran" bot
SetEnvIfNoCase User-Agent "^Szukacz/1.4" bot
SetEnvIfNoCase User-Agent "^tAkeOut" bot
SetEnvIfNoCase User-Agent "^Teleport" bot
SetEnvIfNoCase User-Agent "^Telesoft" bot
SetEnvIfNoCase User-Agent "^TurnitinBot/1.5" bot
SetEnvIfNoCase User-Agent "^The.Intraformant" bot
SetEnvIfNoCase User-Agent "^TheNomad" bot
SetEnvIfNoCase User-Agent "^TightTwatBot" bot
SetEnvIfNoCase User-Agent "^Titan" bot
SetEnvIfNoCase User-Agent "^True_bot" bot
SetEnvIfNoCase User-Agent "^turingos" bot
SetEnvIfNoCase User-Agent "^TurnitinBot" bot
SetEnvIfNoCase User-Agent "^URLy.Warning" bot
SetEnvIfNoCase User-Agent "^Vacuum" bot
SetEnvIfNoCase User-Agent "^VCI" bot
SetEnvIfNoCase User-Agent "^VoidEYE" bot
SetEnvIfNoCase User-Agent "^Web\ Image\ Collector" bot
SetEnvIfNoCase User-Agent "^Web\ Sucker" bot
SetEnvIfNoCase User-Agent "^WebAuto" bot
SetEnvIfNoCase User-Agent "^WebBandit" bot
SetEnvIfNoCase User-Agent "^Webclipping.com" bot
SetEnvIfNoCase User-Agent "^WebCopier" bot
SetEnvIfNoCase User-Agent "^WebEMailExtrac.*" bot
SetEnvIfNoCase User-Agent "^WebEnhancer" bot
SetEnvIfNoCase User-Agent "^WebFetch" bot
SetEnvIfNoCase User-Agent "^WebGo\ IS" bot
SetEnvIfNoCase User-Agent "^Web.Image.Collector" bot
SetEnvIfNoCase User-Agent "^WebLeacher" bot
SetEnvIfNoCase User-Agent "^WebmasterWorldForumBot" bot
SetEnvIfNoCase User-Agent "^WebReaper" bot
SetEnvIfNoCase User-Agent "^WebSauger" bot
SetEnvIfNoCase User-Agent "^Website\ eXtractor" bot
SetEnvIfNoCase User-Agent "^Website\ Quester" bot
SetEnvIfNoCase User-Agent "^Webster" bot
SetEnvIfNoCase User-Agent "^WebStripper" bot
SetEnvIfNoCase User-Agent "^WebWhacker" bot
SetEnvIfNoCase User-Agent "^WebZIP" bot
SetEnvIfNoCase User-Agent "Whacker" bot
SetEnvIfNoCase User-Agent "^Widow" bot
SetEnvIfNoCase User-Agent "^WISENutbot" bot
SetEnvIfNoCase User-Agent "^WWWOFFLE" bot
SetEnvIfNoCase User-Agent "^WWW-Collector-E" bot
SetEnvIfNoCase User-Agent "^Xaldon" bot
SetEnvIfNoCase User-Agent "^Xenu" bot
SetEnvIfNoCase User-Agent "^Zeus" bot
SetEnvIfNoCase User-Agent "ZmEu" bot
SetEnvIfNoCase User-Agent "^Zyborg" bot
SetEnvIfNoCase User-Agent "AhrefsBot" bot
SetEnvIfNoCase User-Agent "HubSpot" bot
SetEnvIfNoCase User-Agent "BLEXBot" bot
SetEnvIfNoCase User-Agent "archive.org_bot" bot
SetEnvIfNoCase User-Agent "bingbot" bot
SetEnvIfNoCase User-Agent "^Wget" bot
# Refuse every request flagged above (Apache 2.2 syntax; Apache 2.4 needs mod_access_compat for it)
Deny from env=bot

You can use this list as is, or keep only those unwanted bots that have actually caused, or are currently causing, high load on your site.

up66.ru

List of search bots

5IBM_Planetwide ABCdatos BotLink Acme.Spider Acoon-Robot 4.0.0RC2 Acoon-Robot 4.0.1 Acoon-Robot 4.0.1 Acoon-Robot 4.0.2 Acorn/Nutch-0.9 adressendeutschland.de AdsBot-Google AdsBot-Google b Ahoy!
The Homepage Finder aipbot/1.0 aipbot/2-beta Alkaline Almaden Almaden bc12 Almaden bc14 Almaden bc22 Almaden bc5 Almaden bc6 Almaden fc13 Almaden hc4 Amfibibot/0.07 ananzi Anthill appie 1.1 Arachnophilia Arale Araneo AraybOt ArchitextSpider archive.org_bot/heritrix/1.13.1 Aretha ARIADNE arks ASAHA Search Engine Turkey V.001 ASAPlinkchecker/1.0 AsapLinkChecker/1.0 b ASAP-LynxViewer/1.0 ASAP-LynxViewer/1.0 b ASAP-LynxViewer/1.1 ASAP-LynxViewer/1.2 ASAP-Web-Sniffer/1.0 Ask Ask Jeeves/Teoma Ask Jeeves/Teoma - b Ask Jeeves/Teoma - c Ask Jeeves/Teoma - d AskJeeves ASpider ATN Worldwide Atomz.com Search Robot AURESYS BackRub Baiduspider Baiduspider b Bay Spider BBot BecomeBot/2.3 BecomeBot/2.3 b BecomeBot/3.0 BecomeBot/3.0 b Big Brother Bigsearch.ca/Nutch-0.9-dev Bigsearch.ca/Nutch-1.0-dev Bjaaland BlackWidow Blaiz-Bee/2.00.5622 Blaiz-Bee/2.00.5655 Blaiz-Bee/2.00.6082 Blaiz-Bee/2.00.8315 Bloodhound boitho.com-dc/0.79 boitho.com-dc/0.83 boitho.com-dc/0.85 boitho.com-dc/0.86 Borg-Bot BoxSeaBot bright.net caching robot BSpider btbot/0.4 BuzzRankingBot/1.0 CACTVS Chemistry Spider Calif Cassandra CazoodleBot a CazoodleBot b CazoodleBot c CazoodleBot d CazoodleBot-0.1 ubee/10.0 Cfetch cfetch/1.0 ChangeDetection changedetection/1.0 Charlotte/1.0b Checkbot ChristCrawler.com churl cIeNcIaFiCcIoN.nEt City4you/1.3 Cesky CJB.NET Proxy CloakDetect/0.9 ClusoBotImage ClusoBotImage/1.0 ClusoBotImage/1.0 b ClusoBotOnline/1.0 ClusoBotOnline/1.0 b Collective Combine System Combine/3 ComputingSite Robi Conceptbot ConfuzzledBot ConveraCrawler 0.9d ConveraCrawler 0.9e ConveraMultiMediaCrawler/0.1 CoolBot Crawllybot/0.1 Crawllybot/0.1 b Crawllybot/0.1 c csci_b659/0.13 Cusco CyberSpyder Link Test CydralSpider DataFountains at Dmoz DataFountains at Dmoz b DataparkSearch/4.40 DAUMOA/1.0.0 DAUMOA/1.0.1 DAUMOA/1.0.1 b del.icio.us-thumbnails/1.0 DepSpid DepSpid/5.07 DepSpid/5.10 DepSpid/5.24 DepSpid/5.25 DepSpid/5.26 Desert Realm Spider dev-spider2.searchpsider.com/1.3b DeWeb(c) Katalog/Index Die 
Blinde Kuh DienstSpider Digger Digimarc MarcSpider Digital Integrity Robot Direct Hit Grabber DNAbot DownLoad Express DragonBot DuckDuckBot/1.0 DWCP EARTHCOM.info/1.98 EARTHCOM.info/1.99 EARTHCOM.info/2.01 EARTHCOM.info/2.03 EARTHCOM.info/2.05 EARTHCOM.info/2.06 EARTHCOM.info/2.07 EARTHCOM.info/2.09 EARTHCOM.info/2.1 EARTHCOM/2.2 EbiNess e-collector EDI/1.6.5 EDI/1.6.6 EDI/1.6.6 b egothor/11.0d egothor/11.0d b egothor/8.0f egothor/8.0g EIT Link Verifier Robot ejupiter.com ejupiter.com 43 ELFINBOT Emacs-w3 Search Engine EnaBot/1.1 EnaBot/1.2 Enterprise_Search/1.00.143 envolk/1.7 envolk/1.7 b envolk/1.7 c envolk[ITS]spider/1.6 esculapio e-SocietyRobot Esther Evliya Celebi Exabot Test/3.0 Exabot/2.0 Exabot/3.0 Exabot/3.0 b Exabot-Images/1.0 Exabot-Images/3.0 Exabot-Test/1.0 ExactSEEK Factbot 1.09 FAST Enterprise Crawler 6 at virk.dk FAST Enterprise Crawler/6 FAST Enterprise Crawler/6.4 FAST MetaWeb Crawler FastCrawler favorstarbot/1.0 Feedster Crawler/3.0 Felix IDE FetchRover fido Findexa Crawler findlinks/0.966 findlinks/0.971 findlinks/0.973 findlinks/0.975 findlinks/0.976 findlinks/1.0 findlinks/1.0.8 findlinks/1.0.9 findlinks/1.0.9-a2 findlinks/1.01 findlinks/1.1 findlinks/1.1.1 findlinks/1.1.1-a1 findlinks/1.1.1-a2 findlinks/1.1.1-a5 findlinks/1.1.2-a2 findlinks/1.1.2-a3 findlinks/1.1.2-a4 findlinks/1.1.2-a5 findlinks/1.1.3-beta1 findlinks/1.1.3-beta2 findlinks/1.1.3-beta4 findlinks/1.1.3-beta6 findlinks/1.1.3-beta7 findlinks/1.1.3-beta8 findlinks/1.1.3-beta9 findlinks/1.1.4-beta1 findlinks/1.1-a4 findlinks/1.1-a5 findlinks/1.1-a7 findlinks/1.1-a8 findlinks/1.1-a9 Fish search flatlandbot flatlandbot b flatlandbot c flatlandbot d Fluid Dynamics Search Engine robot Forschungsportal/0.8-dev Fouineur Francis/2.0 Freecrawl FunnelBack FunnelWeb FurlBot/Furl Search 2.0 FyberSpider/1.2 g2crawler Gaisbot/3.0 Gaisbot/3.0 - 06 Gallent Search Spider v1.4 Robot 3 gammaSpider gazz GCreep genieBot a genieBot b GetBot GetterroboPlus Puu GetURL Giant/1.0 Gigabot Gigabot/1.0 
Gigabot/2.0 Gigabot/2.0 - b Gigabot/2.0 - c Gigabot/2.0 - d Gigabot/2.0 b Gigabot/3.0 Girafabot Girafabot b Girafabot c GOFORITBOT Golem Griffon Gromit Gulper Bot GurujiBot/1.0 GurujiBot/1.0 b HamBot Harvest HatenaScreenshot/1.0 (checker) havIndex heeii/Nutch-0.9-dev at heeii.com heritrix - webarchiv.cz_bot/1.12.1 heritrix at worio.com heritrix/1.10.0 at worio.com heritrix/1.10.1 at researcher.cz heritrix/1.10.2 at i.stanford.edu heritrix/1.10.2 at yacy.net heritrix/1.10.2 at zvents.com heritrix/1.12.1 at edu.org heritrix/1.12.1 at netarkivet.dk heritrix/1.12.1 at newstin.com heritrix/1.12.1 at newtestbabes.com heritrix/1.12.1 at page-store.com heritrix/1.12.1 at page-store.com b heritrix/1.12.1 at webarchiv.cz heritrix/1.4.0 at webarchiv.cz heritrix/1.6.0 at researcher.cz heritrix/1.6.0 at webarchiv.cz heritrix/1.6.0 at worio.com heritrix/1.7.1 at netarkivet.dk heritrix/1.7.3 at webarchiv.cz heritrix/1.8.0 at crawlerx51.com heritrix/1.8.0 at webarchiv.cz heritrix/1.9.0 at webarchiv.cz HI HiddenMarket-1.0-beta HKU WWW Octopus hl_ftien_spider_v1.1 holmes/3.10 - morfeo holmes/3.10.1 - onet.pl holmes/3.11 - morfeo holmes/3.11 - onet.pl holmes/3.12 - morfeo holmes/3.13 - morfeo holmes/3.7 - morfeo holmes/3.8 - morfeo holmes/3.8 - morfeo B holmes/3.9 - morfeo holmes/3.9 - onet.pl holmes/3.9 - onet.pl b Hometown Spider Pro HooWWWer/2.1.3 HooWWWer/2.2.0 ht://Dig html_analyzer HTMLgobble Hyper-Decontextualizer iajaBot image.kapsi.net Imagelock IncyWincy Informant InfoSeek Robot Infoseek Sidewinder InfoSpiders Ingrid Inktomi Slurp Inspector Web IntelliAgent Internet Cruiser Robot Internet Shinchakubin Iron33 Israeli-search JavaBee JBot Java Web Robot JCrawler Jeeves Jobot JoeBot JumpStation Katipo KDD-Explorer Kilroy KIT-Fireball KO_Yappo_Robot LabelGrabber larbin legs Link Validator LinkScan LinkWalker Lockon logo.gif Crawler Lycos Mac WWWWorm Magpie marvin Mattie MediaFox MerzScope MindCrawler mnoGoSearch search engine software moget MOMspider Monster Motor Muncher Muninn 
Muscat Ferret Mwd.Search NDSpider NEC-MeshExplorer Nederland.zoek NetCarta WebMap Engine Netcraft NetMechanic NetScoop newscan-online NHSE Web Forager Nomad Northern Light Gulliver nzexplorer ObjectsSearch OntoSpider Open Text Index Robot Openfind data gatherer Orb Search Pack Rat PageBoy ParaSite Patric pegasus PerlCrawler PGP Key Agent Phantom PhpDig PiltdownMan Pioneer Poppi Popular Iconoclast Portal Juice Spider PortalB Spider psbot Raven Search RBSE Spider Resume Robot RixBot RoadHouse Crawling System Robbie the Robot RoboCrawl Spider RoboFox Robot Robot Francoroute Robozilla Roverbot RuLeS SafetyNet Robot Scooter SearchProcess Senrigan SG-Scout ShagSeeker Shai’Hulud Sift Simmany Robot Ver Site Searcher Site Valet SiteTech-Rover Skymob.com SLCrawler Sleek Smart Spider Snooper Solbot Spanner Speedy Spider spider_monkey SpiderBot Spiderline Crawler SpiderMan SpiderView Spry Wizard Robot Suke suntek search engine Sven Sygol TACH Black Widow Tarantula tarspider Tcl W Robot TechBOT Templeton TeomaTechnologies The Jubii Indexing Robot The NorthStar Robot The NWI Robot The Peregrinator The Python Robot The TkWWW Robot The Web Moose The Web Wombat The Webfoot Robot the World Wide Web Wanderer The World Wide Web Worm TITAN TitIn TLSpider UCSD Crawl UdmSearch UptimeBot URL Check URL Spider Pro Valkyrie Verticrawl Victoria vision-search void-bot Voyager VWbot Walhello appie WallPaper WebBandit Web Spider WebCatcher WebCopy webfetcher Webinator weblayers WebLinker Weblog Monitor WebMirror WebQuest WebReaper webs Websnarf WebSpider WebStolperer WebVac webwalk WebWalker WebWatch WebZinger Wget whatUseek Winona WhoWhere Robot Wired Digital WISENutbot WM wmir WWWC XGET XYLEME Robot

n-wp.ru

Browser Extension Privacy Policy - Ghostery

Effective Date: March 8, 2018

I. Introduction:

The Ghostery Browser Extension (“GBE”) is owned by Cliqz International GmbH (“Cliqz”), which is headquartered at Arabellastrasse 23, 81925 Munich, Germany, (“Company”).
The Company as the responsible body under the German Data Protection Law takes the protection of your personal data very seriously and will always offer you the GBE and its functionality with your privacy in mind. We also recognize that the GBE is popular because people want to be informed and empowered. We share the belief that privacy, when done right, is empowering, and that is why privacy is central to the GBE and new functionality that we may add. Therefore, we only collect data to build our products for the benefits of our users and we believe that we as a company should never have any personal data (“Personal Data”) about our users unless they affirmatively provide it to us. The core functionality of the GBE is to inform users what third-party tracking technologies (“Trackers”) are tracking them on any given website so individuals can exercise personal control over that activity by blocking them for a cleaner, faster, safer browsing experience.

II. Basis to Collect and Use Personal Data

There is no obligation on your part to provide your Personal Data. However, if you do, we have a legitimate interest to collect and use it, namely so we can provide products or services, or complete a transaction with you.

III. Notion of Personal Data

Personal Data means any information concerning the personal or material circumstances of an identified or identifiable individual such as name and age. Non-personal data are all data that cannot be used to identify an individual, such as statistics about usage of a website.

IV. What Personal Data are collected

User Account: Many GBE users had previously requested the ability to open accounts so they can receive product information and also take advantage of new GBE functionality. You are not required to open a user account in order to use the GBE. If you choose to open a user account, you can do so either when you download the GBE, or any other time through the GBE settings.
If you choose to create a user account, we will collect the following Personal Data: name, email address. At any time you can deactivate your user account, at which time you will no longer have access to the services that a user account offers.

IP-Address: We do not differentiate between static or dynamic IP addresses – that is driven at the user level – but please see the Security section below to learn more about the security measures we take to protect data – including your IP-address – that the GBE collects.

V. How Personal Data are used

The Personal Data that we collect when you open an account is used for: (i) syncing your GBE settings across browsers and devices, (ii) serving as your login credentials, and (iii) communicating directly to you through your email address in order to give you information about our products, services, updates and upgrades (in certain cases for a fee). IP-addresses are solely collected for geolocation purposes but only on Zip Code level or above (for example city, county, continent) to improve the GBE. We never store IP addresses.

VI. Collection of Non-Personal Data

When you download the GBE, it collects on an ongoing basis the following data: web browser, operating systems, usage statistics, when an installation, upgrade, or uninstallation occurs, and whether the GBE is active or engaged by you. The use of the aforementioned non-personal data is limited to: (i) communicating through the CMP (see VII.) – since we don’t have your name or email address – in order to share product information or updates and Company news, (ii) for internal analytical purposes such as accurately counting the number of browser extension downloads, or (iii) surveying our users from time to time.

VII. The Consumer Messaging Platform (“CMP”)

The CMP is used from time to time as a way for us to effectively and generically communicate to our users, while still honoring their privacy.
The CMP is automatically turned on, but you can easily turn it off by going to the GBE options page and following the instructions provided. If you turn off the CMP, you can still use the GBE, but you won’t receive any generic communications from us.

VIII. Offers

Offers, also known as Ghostery Rewards, is turned on by default and allows companies to show relevant marketing offers to users based upon an algorithm we created that anonymously determines intent and therefore particular commercial offers that may be of interest to you. This new functionality does not rely upon collected Personal Data and you can opt out of it at any time.

IX. Human Web

We developed a technology called Human Web, which is turned on by default, and creates anonymous group models that power the private quick-search, anti-tracking and anti-phishing technologies featured in Cliqz and Ghostery products.

Data Collection: In order for Human Web to function we automatically collect non-private URLs, search queries along with search engine results pages, suspicious URLs that could potentially be phishing websites, information related to safe and unsafe trackers, and information related to the prevalence and performance of Trackers.

Data Use: The data that we collect so Human Web can work is anonymized, aggregated and transmitted through the Human Web Proxy Network and used to improve the search, anti-tracking and anti-phishing features in Cliqz and Ghostery products. For further information please go to https://cliqz.com/whycliqz/human-web.

X. Data Processing Abroad

Although the Company is located in Germany, it partly operates out of the United States. The data we collect, personal or otherwise, are located on servers based in the United States. If you are accessing or using GBE from the European Union or other regions with laws governing data collection and use that may differ from U.S. law, please note that you may be allowing the collection or transferring of your personal data in or to the U.S.
However, we have a strong data privacy framework in place to ensure an adequate level of protection for your Personal Data.

XI. Data Retention

If you deactivate your user account, the Company retains the collected Personal Data if and for as long as it may be required by law (for example to fulfill retention periods prescribed by law) or judicial order. The Company will use this personal data only for those purposes and retains it only as long as prescribed by law. After that the Personal Data will be deleted.

XII. Security

The Company has reasonable and appropriate technical, physical and administrative safeguards in place for a company of our size and complexity to protect the data that is collected. Some of the specific security measures we take include instantly hashing the origination IP addresses using very strong encryption technology to protect your privacy, whereupon the collected IP addresses and user agent information is destroyed. In addition, to further preserve your security, the GBE does not collect any information on URLs beyond the path query string.

XIII. Contacting the Company

At any time the user has the right to object to any use of his personal data and can do so by writing to the Company at the physical address provided in the beginning of the document or by emailing the Company at [email protected]. If you object, it will be necessary to prove that you are the owner of the account. The Company has the right to answer your inquiry electronically. Please contact us – for this and all other inquiries, comments or concerns about these practices – by email at [email protected].

XIV. Changes to Privacy Policy

We may occasionally change this Privacy Policy and when we do, we will also revise the “Effective Date” at the top of the Privacy Policy.
If we make any material changes to our Privacy Policy, we will try to inform you via the CMP and, if you opened an account and gave us your email address, then we will also try to contact you through the email address you provided about those material changes. Ultimately, however, it is your responsibility to periodically review this Privacy Policy to stay informed about our data practices and any changes to them. Your continued use of the GBE constitutes your agreement to this Privacy Policy and any changes to it.

www.ghostery.com