Site-specific article extraction rules to aid content extractors, feed readers, and 'read later' applications. https://www.fivefilters.org/full-text-rss/
2026-03-04 15:00:28 +01:00
.about.com.txt
.allthingsd.com.txt
.asahi.com.txt add .asahi.com.txt and biz-journal.jp.txt (#1911) 2026-03-04 15:00:28 +01:00
.blog.163.com.txt
.blog.hu.txt Create .blog.hu.txt (#1465) 2024-10-28 01:05:57 +01:00
.blogs.nytimes.com.txt
.blogspot.com.txt Update .blogspot.com.txt 2023-10-09 19:57:07 +02:00
.businessinsider.com.txt
.cnet.com.txt
.craigslist.org.txt
.ctv.ca.txt
.denfaminicogamer.jp.txt add .denfaminicogamer.jp.txt (#1827) 2025-12-20 18:59:32 +01:00
.dreamwidth.org.txt
.dxy.cn.txt
.elpais.com.txt Update elpais.com (#1007) 2022-11-02 09:59:04 +01:00
.etc.se.txt
.ew.com.txt
.fivefilters.org.txt
.fok.nl.txt
.gitattributes
.gitignore
.globo.com.txt Globo.com (#1494) 2024-11-16 08:52:26 +01:00
.hardware.info.txt
.ietf.org.txt Update .ietf.org.txt 2023-11-08 08:21:27 +01:00
.ifeng.com.txt
.ihned.cz.txt
.itmedia.co.jp.txt feat: Add configuration files for .itmedia.co.jp and atmarkit.itmedia.co.jp (#1797) 2025-12-02 19:22:14 +01:00
.lingolia.com.txt Update .lingolia.com.txt (#1624) 2025-05-25 09:32:52 +02:00
.livejournal.com.txt
.m.wikihow.com.txt
.medium.com.txt Medium.com (#1169) 2023-07-25 06:45:51 +02:00
.metafilter.com.txt
.mitpress.mit.edu.txt Create .mitpress.mit.edu.txt 2022-07-14 21:41:16 -04:00
.mozilla.org.txt
.nasa.gov.txt Create .nasa.gov.txt (#1462) 2024-10-27 17:49:35 +01:00
.nytimes.com.txt Fix fetching nytimes.com articles (#1000) 2022-10-14 22:19:17 +02:00
.onliner.by.txt
.orf.at.txt
.over-blog.com.txt Create .over-blog.com.txt (#1682) 2025-06-24 23:45:16 +02:00
.philhist.unibas.ch.txt Rename .unibas.ch.txt to .philhist.unibas.ch.txt 2022-01-25 22:10:25 +01:00
.playblackdesert.com.txt Create .playblackdesert.com.txt 2021-03-03 18:04:17 +01:00
.quora.com.txt Update .quora.com.txt 2020-11-15 20:43:43 +01:00
.readthedocs.io.txt
.redbullmusicacademy.com.txt Add redbullmusicacademy.com config (#1022) 2023-01-02 07:02:59 +01:00
.repubblica.it.txt
.robweychert.com.txt add robweychert.com (#1097) 2023-06-19 09:05:59 +02:00
.rollingstone.com.txt Create .rollingstone.com.txt (#1775) 2025-10-11 19:50:17 +02:00
.schwab.com.txt
.signal-arnaques.com.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
.simonwillison.net.txt Update .simonwillison.net.txt (#1265) 2023-12-10 11:22:50 +01:00
.slashdot.org.txt slashdot: replace i tags with blockquote (#929) 2022-02-15 07:16:41 +01:00
.smashingmagazine.com.txt
.sodexo.com.txt
.sputniknews.com.txt
.stackexchange.com.txt Fix extracting body on stackexchange sites (#1785) 2025-11-09 13:55:42 +01:00
.stanford.edu.txt
.statista.com.txt Create statista.com / es.statista.com (#852) 2021-01-20 13:51:38 +01:00
.substack.com.txt Subst2 (#1713) 2025-07-15 10:07:06 +02:00
.theinventory.com.txt Create .theinventory.com.txt 2021-01-01 17:04:37 +01:00
.theonion.com.txt
.theplayerstribune.com.txt Create .theplayerstribune.com.txt (#1408) 2024-07-26 15:46:36 +02:00
.time.com.txt
.tvbs.com.tw.txt Update .tvbs.com.tw.txt (#1337) 2024-02-16 01:54:30 +01:00
.tweakblogs.net.txt
.usinenouvelle.com.txt
.vanityfair.com.txt Update .vanityfair.com.txt (#1415) 2024-08-01 21:37:11 +02:00
.visualcapitalist.com.txt Create .visualcapitalist.com.txt (#1115) 2023-06-26 06:33:38 +02:00
.watson.de.txt Update .watson.de.txt 2024-07-25 15:04:02 +02:00
.wikihow.com.txt
.wikimedia.org.txt
.wikipedia.org.txt remove redundant math so only image is kept. (#979) 2022-06-13 06:08:11 +02:00
.wired.com.txt Create .wired.com.txt (#1569) 2025-03-12 14:46:41 +01:00
.wordpress.com.txt chore: update wordpress (#1813) 2026-02-06 17:27:23 +01:00
.wp.pl.txt Update .wp.pl.txt (#1039) 2023-01-27 13:55:45 +01:00
.wyborcza.biz.txt Wyborcza (#1379) 2024-05-21 15:02:38 +02:00
.wyborcza.pl.txt Wyborcza (#1379) 2024-05-21 15:02:38 +02:00
.yahoo.com.txt Changed 3 Yahoo configs (#1400) 2024-07-07 10:12:10 +02:00
01net.com.txt Create 01net.com.txt (#1586) 2025-04-15 08:01:16 +02:00
3quarksdaily.com.txt
3voor12.vpro.nl.txt
5by5.tv.txt
7newsbelize.com.txt
8e-etage.fr.txt
9gag.com.txt
9to5google.com.txt Create 9to5google.com.txt (#1322) 2024-01-27 06:37:55 +01:00
9to5mac.com.txt
16personalities.com.txt chore: add working rule without JS for 16personalities (#1883) 2026-02-26 17:17:49 +01:00
20min.ch.txt
20minutes.fr.txt Update 20minutes.fr.txt 2024-05-27 09:26:21 +02:00
24.ae.txt
24a11y.com.txt
24auto.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
24garten.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
24hamburg.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
24joursdeweb.fr.txt fix: add title and body to 24joursdeweb (#1808) 2025-12-07 23:28:27 +01:00
24rhein.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
24vita.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
24ways.org.txt
36kr.com.txt
37signals.com.txt
43folders.com.txt
404media.co.txt Update 404media.co.txt 2025-06-25 17:43:22 -04:00
500px.com.txt
512pixels.net.txt
a.tldrnewsletter.com.txt add support for a.tldrnewsletter.com (#1478) 2024-11-03 14:46:42 +01:00
a11ywithlindsey.com.txt
aachener-nachrichten.de.txt
aarp.org.txt Update aarp.org.txt 2021-05-12 16:40:02 +02:00
abc-luxe.com.txt
abc.es.txt
abc.net.au.txt Update abc.net.au.txt (#1181) 2023-08-13 23:15:31 +02:00
abcnews.go.com.txt
abendblatt.de.txt Funke (#1373) 2024-04-30 21:37:22 +02:00
abendzeitung-muenchen.de.txt Create abendzeitung-muenchen.de.txt (#1185) 2023-08-20 12:55:29 +02:00
abplive.com.txt Create abplive.com.txt 2024-05-02 10:36:18 +02:00
absolument-tout.net.txt Create absolument-tout.net.txt (#1757) 2025-09-06 09:35:09 +02:00
academic.oup.com.txt Shtrom 2023 03 (#1059) 2023-03-08 12:42:57 +01:00
academiedugout.fr.txt
accaglobal.com.txt Create accaglobal.com.txt 2022-06-13 00:36:28 +02:00
access.redhat.com.txt fix: body rule for redhat.com (#1897) 2026-02-26 17:58:46 +01:00
accesstoinsight.org.txt
achgut.com.txt add config for achgut.com (#984) 2022-07-18 06:55:23 +02:00
acidcow.com.txt
aclu.org.txt Shtrom 2020 09 (#802) 2020-09-14 16:45:53 +02:00
acroswing.fr.txt
actualitte.com.txt
ad.nl.txt
addendum.org.txt
adfc-nrw.de.txt
adme.ru.txt
adslzone.net.txt Update adslzone.net (#1005) 2022-11-02 09:58:34 +01:00
aei.org.txt
aeon.co.txt Update stripping rules in aeon.co.txt (#1792) 2025-11-24 08:49:33 +01:00
aerobuzz.fr.txt Add aerobuzz.fr.txt (#883) 2021-05-14 00:46:52 +02:00
afr.com.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
africaintelligence.fr.txt
aftenposten.no.txt
aftonbladet.se.txt
agirpourlatransition.ademe.fr.txt Create agirpourlatransition.ademe.fr.txt (#1298) 2024-01-07 14:39:28 +01:00
aht.seriouseats.com.txt
aif.ru.txt Update and add some configs (#835) 2020-12-28 11:05:02 +01:00
aitnews.com.txt
akweb.de.txt Add configuration for ak - analyse & kritik (#834) 2020-12-17 20:15:24 +01:00
al-monitor.com.txt Create al-monitor.com.txt (#1247) 2023-11-14 22:53:36 +01:00
albayan.ae.txt
alberta.ca.txt Create alberta.ca.txt 2020-11-28 12:57:43 +01:00
alex.mullr.net.txt
alexduner.com.txt
alexmurrell.co.uk.txt Create alexmurrell.co.uk.txt (#1072) 2023-03-30 13:40:12 +02:00
alexwlchan.net.txt Create alexwlchan.net.txt (#1641) 2025-06-03 12:56:43 +02:00
alicewalkersgarden.com.txt Create alicewalkersgarden.com.txt (#1359) 2024-04-08 13:55:06 +02:00
aligneddev.net.txt Create aligneddev.net.txt (#1659) 2025-06-13 16:25:17 +02:00
alimentation-generale.fr.txt
alistapart.com.txt Update alistapart.com.txt (#863) 2021-03-06 23:48:49 +01:00
aljazeera.com.txt
allafrica.com.txt
allgemeine-zeitung.de.txt Update allgemeine-zeitung.de.txt 2023-10-15 14:59:48 +02:00
allphly.com.txt Create allphly.com.txt (#1209) 2023-09-25 05:44:03 +02:00
allrecipes.com.txt
allthingsd.com.txt
allyou.com.txt
alphabeta.argaam.com.txt
alriyadh.com.txt
alsacreations.com.txt
alseraj.net.txt
altaonline.com.txt Create altaonline.com.txt 2022-07-14 21:10:36 -04:00
alternatives-economiques.fr.txt Update alternatives-economiques.fr.txt (#1032) 2023-01-19 19:51:45 +01:00
alternator.science.txt Create alternator.science.txt (#1492) 2024-11-12 14:09:31 +01:00
alternet.org.txt
altfoto.com.txt
alumni.stanford.edu.txt
amandala.com.bz.txt
amazon.com.txt
americandrink.net.txt
americanprogress.org.txt Create americanprogress.org.txt (#1555) 2025-02-04 09:32:20 +01:00
americanthinker.com.txt
americastestkitchenfeed.com.txt
amp.themercury.com.au.txt
amptoons.com.txt
anandtech.com.txt
androidandme.com.txt
androidcentral.com.txt Create androidcentral.com.txt (#1335) 2024-02-10 22:38:14 +01:00
androidpolice.com.txt Update androidpolice.com.txt (#1621) 2025-05-16 13:47:18 +02:00
andy-bell.design.txt
angrymetalguy.com.txt
annatravelling.wordpress.com.txt
annouchka.fr.txt
ansible.com.txt add config for ansible.com (#1094) 2023-06-16 21:55:33 +02:00
answers.microsoft.com.txt Create answers.microsoft.com.txt (#1688) 2025-06-28 07:00:18 +02:00
answersresearchjournal.org.txt Create answersresearchjournal.org.txt 2023-03-21 00:37:21 +01:00
antigone21.com.txt Create antigone21.com.txt (#1463) 2024-10-27 18:36:25 +01:00
antirez.com.txt
aoc.media.txt
apache.be.txt
apnews.com.txt chore: add rules for apnews.com (#1884) 2026-02-24 15:01:17 +01:00
apotheke-adhoc.de.txt fix: bad test_contains directives (#1874) 2026-02-20 18:09:13 +01:00
apple.com.txt Create apple.com.txt 2023-11-06 15:39:29 +01:00
apple.news.txt
appleinsider.com.txt
appleweblog.com.txt
aps.dz.txt Create aps.dz.txt (#1044) 2023-02-06 07:01:47 +01:00
araraneon.com.br.txt Create araraneon.com.br.txt (#1416) 2024-08-02 08:38:12 +02:00
archdaily.com.txt
archiloque.net.txt Create archiloque.net.txt 2021-01-06 16:44:55 +01:00
architecturaldigest.com.txt Create architecturaldigest.com.txt (#1495) 2024-11-16 10:44:58 +01:00
archive.pressthink.org.txt
archiveofourown.org.txt Update archiveofourown.org.txt (#1630) 2025-05-27 22:01:09 +02:00
archlinux.de.txt Create archlinux.de.txt (#1556) 2025-02-07 14:08:26 +01:00
arduino-tutorial.de.txt
arretsurimages.net.txt LPL (#1788) 2025-11-18 21:24:33 +01:00
arstechnica.com.txt Update arstechnica.com.txt (#1471) 2024-10-30 21:27:50 +01:00
artforum.com.txt Update artforum.com.txt 2024-07-15 14:23:53 +02:00
articles.courant.com.txt
articles.washingtonpost.com.txt
artofmanliness.com.txt
artresilia.com.txt Create artresilia.com.txt (#1634) 2025-05-30 15:46:53 +02:00
artsixmic.fr.txt
arxiv-vanity.com.txt Update arxiv-vanity.com.txt 2023-06-08 15:29:31 +02:00
arxiv.org.txt add publication date and author to arXiv (#1745) 2025-08-12 18:46:25 +02:00
as-web.jp.txt add as-web.jp.txt and mainichi.jp.txt (#1822) 2025-12-17 20:01:19 +01:00
asahi.com.txt add below site configs (#1849) 2026-01-17 12:51:20 +01:00
ascarter.net.txt
ascii.jp.txt Add replace(h2) and use strip id or class (#1828) 2025-12-22 09:34:12 +01:00
askingbox.de.txt
askubuntu.com.txt Create askubuntu.com.txt (#1772) 2025-10-09 20:36:35 +02:00
astronews.com.txt
astronomy.com.txt
asymco.com.txt
atlantico.fr.txt
atlasobscura.com.txt Create atlasobscura.com.txt (#1576) 2025-03-28 08:12:57 +01:00
atmarkit.itmedia.co.jp.txt Add replace(h2) and use strip id or class (#1828) 2025-12-22 09:34:12 +01:00
au.lifehacker.com.txt Update and rename lifehacker.com.au.txt to au.lifehacker.com.txt (#1528) 2024-12-12 16:38:55 +01:00
au.news.yahoo.com.txt Changed 3 Yahoo configs (#1400) 2024-07-07 10:12:10 +02:00
audiobookshelf.org.txt Create audiobookshelf.org.txt (#1755) 2025-09-02 08:41:48 +02:00
auto-motor-und-sport.de.txt add config for auto-motor-und-sport.de (#1302) 2024-01-11 02:04:16 +01:00
autoactu.com.txt
autoblog.com.txt
autocar.co.uk.txt Update autocar.co.uk.txt (#1672) 2025-06-17 09:05:55 +02:00
autocrypt.org.txt
automobil-produktion.de.txt Create automobil-produktion.de.txt (#1357) 2024-04-07 04:05:42 +02:00
autoplus.fr.txt Update autoplus.fr.txt 2020-10-07 12:53:47 +02:00
avantivictoirerao.com.txt
avclub.com.txt The kinja sites updated their engine and now they tag their body content using "js_post-content" instead of just "post-content" (#917) 2021-11-29 19:45:01 +01:00
awealthofcommonsense.com.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
aws.amazon.com.txt
axesslab.com.txt
axiocap.com.txt Update axiocap.com.txt 2024-02-23 03:29:22 +01:00
axios.com.txt Update axios.com.txt (#1752) 2025-08-27 08:49:31 +02:00
az-online.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
az.lib.ru.txt Add az.lib.ru 2025-11-24 19:42:42 +01:00
backlinko.com.txt
bahnblogstelle.com.txt Update bahnblogstelle.com.txt (#1130) 2023-07-03 16:57:16 +02:00
baltimoresun.com.txt
banglarrannaghor.com.txt Add scraping rules for banglarrannaghor.com (#1815) 2025-12-13 01:02:21 +01:00
barrons.com.txt Create barrons.com.txt (#1509) 2024-12-01 08:09:10 +01:00
baseballprospectus.com.txt
basicthinking.de.txt
basketeurope.com.txt
bastamag.net.txt
bastibe.de.txt Create bastibe.de.txt (#1505) 2024-11-22 08:36:35 +01:00
batenka.ru.txt Create batenka.ru.txt 2021-10-17 09:52:46 +02:00
baylon-industries.com.txt
bayometric.com.txt Add bayometric.com.txt with scraping details (#1839) 2026-01-06 18:38:28 +01:00
bbc.co.uk.txt Shtrom 2024 03 (#1347) 2024-03-05 11:52:33 +01:00
bbc.com.txt Update bbc.com.txt 2024-07-18 11:51:17 +02:00
bbcgoodfood.com.txt Bbcgoodfood (#1407) 2024-07-26 13:52:51 +02:00
bbva.es.txt Create bbva.es.txt 2022-10-25 11:18:12 +02:00
bdaily.co.uk.txt Create bdaily.co.uk.txt 2022-07-06 12:14:32 -04:00
bearmetal.eu.txt
becomingminimalist.com.txt
begeek.fr.txt
ben-evans.com.txt Create ben-evans.com.txt (#1423) 2024-08-23 08:05:13 +02:00
benoitmaison.org.txt
berliner-zeitung.de.txt Update berliner-zeitung.de.txt (#1771) 2025-10-09 08:35:35 +02:00
berlingske.dk.txt
bernama.com.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
bernardinai.lt.txt Create bernardinai.lt.txt (#1382) 2024-05-27 22:00:17 +02:00
besabine.com.txt Create besabine.com.txt (#1500) 2024-11-17 10:26:33 +01:00
bestcarweb.jp.txt add bestcarweb.jp.txt (#1896) 2026-03-02 09:32:50 +01:00
betabeat.com.txt
betanews.com.txt
bez.es.txt
bgland24.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
bild.de.txt
biography.com.txt
birthdayshoes.com.txt
bit-tech.net.txt
bitelia.com.txt
biz-journal.jp.txt add .asahi.com.txt and biz-journal.jp.txt (#1911) 2026-03-04 15:00:28 +01:00
bizjournals.com.txt
bjango.com.txt
blaetter.de.txt Create blaetter.de.txt (#1762) 2025-09-20 12:54:15 +02:00
blast-info.fr.txt blast: remove navigation links (#1838) 2026-01-05 10:35:57 +01:00
bleacherreport.com.txt
blog.angular.io.txt create 3 new configs (#1149) 2023-07-12 06:28:40 +02:00
blog.asmartbear.com.txt
blog.chriszacharias.com.txt
blog.cloudflare.com.txt
blog.dropbox.com.txt
blog.eleven-labs.com.txt
blog.eng.xogrp.com.txt
blog.engineering.publicissapient.fr.txt Create blog.engineering.publicissapient.fr.txt (#891) 2021-08-17 16:56:19 +02:00
blog.fefe.de.txt
blog.google.txt Create blog.google.txt (#1656) 2025-06-12 15:52:38 +02:00
blog.imirhil.fr.txt
blog.instagram.com.txt
blog.instapaper.com.txt
blog.kaelig.fr.txt
blog.landr.com.txt Site config for blog.landr.com (#946) 2022-03-04 13:53:34 +01:00
blog.lepine.pro.txt Create blog.lepine.pro.txt (#1013) 2022-11-28 22:50:40 +01:00
blog.lumen.com.txt Create blog.lumen.com.txt (#1503) 2024-11-22 07:39:55 +01:00
blog.mochi.is.txt Create blog.mochi.is.txt (#1534) 2024-12-19 09:22:58 +01:00
blog.mondediplo.net.txt Create blog.mondediplo.net.txt (#1520) 2024-12-06 07:41:29 +01:00
blog.mozilla.org.txt fix: mozilla blog selectors (#1892) 2026-02-26 13:29:36 +01:00
blog.native-instruments.com.txt Create blog.native-instruments.com.txt 2021-06-09 21:40:12 +02:00
blog.naver.com.txt fix: bad test_contains directives (#1874) 2026-02-20 18:09:13 +01:00
blog.netinfluence.ch.txt
blog.nightly.mozilla.org.txt
blog.octo.com.txt Create blog.octo.com.txt (#892) 2021-07-09 08:26:56 +02:00
blog.pchome.net.txt
blog.pinboard.in.txt
blog.professeurjoachim.com.txt Add blog.professeurjoachim.com.txt (#1514) 2024-12-05 08:49:48 +01:00
blog.rchapman.org.txt Create blog.rchapman.org.txt (#1459) 2024-10-25 00:23:45 +02:00
blog.renren.com.txt
blog.robertelder.org.txt Add blog.robertelder.org.txt (#980) 2022-06-13 06:07:39 +02:00
blog.rust-lang.org.txt Update blog.rust-lang.org.txt 2023-10-17 10:11:11 +02:00
blog.sentry.io.txt Sentry.io (#1140) 2023-07-07 17:02:14 +02:00
blog.serverlessadvocate.com.txt Create blog.serverlessadvocate.com.txt (#1145) 2023-07-12 06:28:14 +02:00
blog.shaunfinglas.co.uk.txt Shtrom 2020 09 (#802) 2020-09-14 16:45:53 +02:00
blog.sina.com.cn.txt
blog.spu.edu.txt
blog.squad.fr.txt
blog.stenmans.org.txt Create blog.stenmans.org.txt (#1736) 2025-08-07 03:43:41 +02:00
blog.terkel.io.txt Create blog.terkel.io.txt 2023-02-08 00:23:16 +01:00
blog.trello.com.txt
blog.twitter.com.txt
blog.wells.ee.txt
blog.xebia.fr.txt
blog.youb.fr.txt
blogs.faz.net.txt
blogs.forbes.com.txt
blogs.gnome.org.txt
blogs.lse.ac.uk.txt Create blogs.lse.ac.uk.txt (#1501) 2024-11-19 08:07:34 +01:00
blogs.oracle.com.txt Create blogs.oracle.com.txt 2023-11-10 15:23:38 +01:00
blogs.reuters.com.txt
blogs.sciencemag.org.txt
blogs.smithsonianmag.com.txt
blogs.technet.com.txt
bloomberg.com.txt Update bloomberg.com.txt (#1545) 2025-01-08 16:03:52 +01:00
boagworld.com.txt
boards.greenhouse.io.txt Create boards.greenhouse.io.txt (#1197) 2023-09-04 07:07:41 +02:00
bobbyhiltz.com.txt added bobbyhiltz.com (#1799) 2025-12-03 17:52:04 +01:00
bobbyromeo.com.txt
bohaishibei.com.txt
boingboing.net.txt Add title and date extraction to boingboing.net (#1835) 2026-01-04 07:46:14 +01:00
bonpote.com.txt Update bonpote.com.txt (#1411) 2024-07-29 15:50:34 +02:00
book.douban.com.txt
bookforum.com.txt
borderhouseblog.com.txt
bosch-presse.de.txt
bostonglobe.com.txt Update bostonglobe.com.txt (#1256) 2023-11-29 16:43:46 +01:00
bostonreview.net.txt Update bostonreview.net.txt 2022-07-14 21:54:13 -04:00
boundlessline.org.txt
boxingnewsonline.net.txt
bpb.de.txt Update bpb.de.txt 2025-04-21 00:38:44 +02:00
br.de.txt Create br.de.txt 2023-11-11 14:16:28 +01:00
brainfacts.org.txt
brainpickings.org.txt
brandeins.de.txt
brandingstrategyinsider.com.txt
brasil.elpais.com.txt fix: elpais body rule (#1885) 2026-02-23 19:41:12 +01:00
braunschweiger-zeitung.de.txt Funke (#1373) 2024-04-30 21:37:22 +02:00
breitengrad-nord.de.txt Create breitengrad-nord.de.txt (#1344) 2024-02-28 09:19:31 +01:00
brentozar.com.txt Create brentozar.com.txt (#866) 2021-03-17 21:00:18 +01:00
brettterpstra.com.txt
briefly.co.za.txt Create briefly.co.za.txt 2021-12-21 14:51:30 +01:00
brightside.me.txt
brit.co.txt Create brit.co.txt (#1401) 2024-07-07 10:25:29 +02:00
brookings.edu.txt Add brookings.edu.txt (#865) 2021-03-15 01:40:41 +01:00
brooksreview.net.txt
brucelawson.co.uk.txt
bt.no.txt
buerstaedter-zeitung.de.txt Update buerstaedter-zeitung.de.txt 2023-10-15 15:01:44 +02:00
buffed.de.txt Update buffed.de.txt 2023-10-24 00:02:22 +02:00
buildvirtual.net.txt Create buildvirtual.net.txt (#1474) 2024-10-31 00:22:16 +01:00
bunshun.jp.txt add below site configs (#1849) 2026-01-17 12:51:20 +01:00
buquad.com.txt
business-standard.com.txt Create business-standard.com.txt 2024-09-26 11:57:12 +02:00
business.time.com.txt
business2community.com.txt
businessinsider.com.au.txt
businessinsider.com.txt Update businessinsider.com.txt (#1637) 2025-05-31 14:54:34 +02:00
businessinsider.jp.txt add tokyo-np.co.jp.txt and businessinsider.jp.txt (#1823) 2025-12-18 13:42:33 +01:00
businessnews.com.tn.txt
businessweek.com.txt
buzzfeed.com.txt
buzzfeed.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
bw24.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
bzg.fr.txt Create bzg.fr.txt (#1593) 2025-04-17 22:52:23 +02:00
c.newsnow.co.uk.txt
c.newsnow.com.txt
cabinetmagazine.org.txt Create cabinetmagazine.org.txt 2021-10-10 11:42:37 +02:00
cable.co.uk.txt
cafebabel.com.txt
caffereggio.net.txt
callistaenterprise.se.txt Create callistaenterprise.se.txt (#1464) 2024-10-28 00:42:30 +01:00
canardpc.com.txt Update canardpc.com.txt (#1458) 2024-10-30 03:12:47 +01:00
canonrumors.com.txt
captaineconomics.fr.txt
car-it.com.txt
caranddriver.com.txt Create caranddriver.com.txt (#1648) 2025-06-08 09:54:13 +02:00
caravanmagazine.in.txt Create caravanmagazine.in.txt 2022-11-09 00:59:25 +01:00
cardboardconnection.com.txt
carlchenet.com.txt
carnegie.ru.txt Rename carnegie.ru.tx to carnegie.ru.txt 2022-02-17 22:22:59 +01:00
carnegieeurope.eu.txt Added carnegieeurope.eu (#824) 2020-10-16 11:50:26 +02:00
cars.com.txt
caseinterview.com.txt Update caseinterview.com.txt 2021-06-19 16:30:13 +02:00
cashless.pl.txt
catapult.co.txt Update catapult.co.txt 2020-10-28 10:29:40 +01:00
catb.org.txt
cbsnews.com.txt Revamp CBS News (#1908) 2026-03-04 01:59:31 +01:00
cell.com.txt Update XPath selector for article body (#1842) 2026-01-08 18:44:22 +01:00
cert-bund.de.txt Make the feed from cert-bund.de more useful (#921) 2022-01-13 11:35:20 +01:00
certaintynews.com.txt Update certaintynews.com.txt 2023-11-30 11:23:48 +01:00
cfclrk.com.txt Create cfclrk.com.txt (#1574) 2025-03-26 16:03:18 +01:00
cgtrader.com.txt Cgtrader (#1640) 2025-06-01 01:51:30 +02:00
champeau.info.txt
channelnewsasia.com.txt Update channelnewsasia.com.txt 2026-02-12 16:43:26 +01:00
chaperonsetvous.fr.txt Create chaperonsetvous.fr.txt (#981) 2022-06-14 16:50:00 +02:00
chareidi.org.txt
charlotteobserver.com.txt
chat.openai.com.txt Update chat.openai.com.txt 2023-11-08 15:47:00 +01:00
chefkoch.de.txt
chicagotribune.com.txt Create chicagotribune.com.txt 2021-05-26 00:25:07 +02:00
chiemgau24.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
china-gadgets.de.txt add china-gadgets.de config (#1309) 2024-01-15 23:56:26 +01:00
chip.de.txt Create chip.de.txt (#1424) 2024-08-24 09:20:16 +02:00
choice.com.au.txt Update choice.com.au.txt 2026-02-20 19:16:38 +01:00
chomsky.info.txt chore: add body and fix author for chomsky.info (#1886) 2026-02-24 13:41:05 +01:00
chrisltd.com.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
christianitytoday.com.txt
christies.com.txt
chrome.google.com.txt
chronicle.com.txt Update chronicle.com.txt 2020-08-24 02:17:41 +02:00
ciaosamin.com.txt
cicero.de.txt
cio.com.txt Idg (#1786) 2025-11-12 16:15:45 +01:00
ciperchile.cl.txt
cityam.com.txt Update cityam.com.txt 2024-05-20 19:07:22 +02:00
citylab.com.txt
cjr.org.txt
clarin.com.txt Create clarin.com.txt (#1166) 2023-07-23 08:28:13 +02:00
classcentral.com.txt Create classcentral.com.txt (#1502) 2024-11-22 06:19:02 +01:00
cleafy.com.txt Update cleafy.com.txt (#1350) 2024-03-09 13:10:37 +01:00
cleantechnica.com.txt
clientk.com.txt
cloud.google.com.txt Update cloud.google.com.txt 2026-02-15 19:38:08 +01:00
cloudacademy.com.txt
clubic.com.txt Update clubic.com.txt 2025-10-13 10:39:26 +02:00
cmace.de.txt
cmns.umd.edu.txt Shtrom 2022 04 (#966) 2022-04-22 09:57:46 +02:00
cmswire.com.txt
cn.engadget.com.txt
cn.nytimes.com.txt Create cn.nytimes.com.txt 2022-11-08 23:07:05 +01:00
cn.reuters.com.txt
cnbc.com.txt Update cnbc.com.txt 2021-05-19 22:58:04 +02:00
cnet.com.txt Update cnet.com.txt 2025-06-25 17:54:26 -04:00
cnetfrance.fr.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
cnews.ru.txt Update cnews.ru.txt (#1553) 2025-01-28 17:19:37 +01:00
cnn.com.txt Update cnn.com.txt (#1798) 2025-12-03 17:56:50 +01:00
cnrs.fr.txt Added cnrs.fr.txt (#876) 2021-05-04 08:58:59 +02:00
cntraveller.com.txt Update cntraveller.com.txt 2023-01-13 01:17:57 +01:00
coalicionporelevangelio.org.txt Create coalicionporelevangelio.org.txt 2022-10-25 11:26:55 +02:00
code.activestate.com.txt
code.google.com.txt
codebase64.org.txt
codeproject.com.txt
codinghorror.com.txt
codyhosterman.com.txt Create codyhosterman.com.txt (#867) 2021-03-17 20:58:13 +01:00
coffeecircle.com.txt
cohost.org.txt Add cohost.org ko-fi.com and pcgamer.com (#1364) 2024-04-13 16:18:23 +02:00
cointelegraph.com.txt Add cointelegraph.com.txt (#881) 2021-05-14 00:47:34 +02:00
collective-evolution.com.txt
collegehumor.com.txt
columbiaspectator.com.txt Create columbiaspectator.com.txt (#1434) 2024-09-23 20:40:54 +02:00
come-on.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
commentarymagazine.com.txt Update commentarymagazine.com.txt 2021-02-13 21:44:12 +01:00
commitstrip.com.txt
commondreams.org.txt Update commondreams.org.txt (#1766) 2025-10-01 21:45:22 +02:00
commonwealmagazine.org.txt Create commonwealmagazine.org.txt 2023-02-18 01:05:41 +01:00
communities-dominate.blogs.com.txt
community.element14.com.txt Create community.element14.com.txt (#1473) 2024-10-31 00:08:32 +01:00
community.lucid.co.txt Comment out callout strip rule in community.lucid.co.txt 2026-02-10 19:19:44 +01:00
community.openstreetmap.org.txt Update community.openstreetmap.org.txt (#1452) 2024-10-21 02:10:58 +02:00
community.readeck.org.txt Create community.readeck.org.txt (#1603) 2025-05-03 08:23:02 +02:00
community.silverbullet.md.txt add community.silverbullet.md (#1844) 2026-01-09 21:45:13 +01:00
composer.spitfireaudio.com.txt Update composer.spitfireaudio.com.txt 2021-07-10 02:01:39 +02:00
computerbase.de.txt
computerworld.com.txt Idg (#1786) 2025-11-12 16:15:45 +01:00
computerworld.dk.txt
consortiumnews.com.txt Create consortiumnews.com.txt 2021-05-03 12:34:38 +02:00
consumerreports.org.txt Create consumerreports.org.txt 2024-06-13 17:05:22 +02:00
contexte.com.txt
contrepoints.org.txt
cooking.nytimes.com.txt Create cooking.nytimes.com.txt 2021-01-06 16:36:08 +01:00
cooper.com.txt
core77.com.txt
correctiv.org.txt Update correctiv.org.txt 2022-05-24 00:27:52 +02:00
costanachrichten.com.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
counterpunch.org.txt
countrylife.co.uk.txt Create countrylife.co.uk.txt (#1469) 2024-10-30 02:34:42 +01:00
courrierdesbalkans.fr.txt
courrierdeuropecentrale.fr.txt
courrierinternational.com.txt Update courrierinternational.com.txt (#1391) 2024-06-15 01:10:47 +02:00
creteinsider.com.txt Create creteinsider.com.txt (#1705) 2025-07-06 23:57:38 +02:00
crikey.com.au.txt crikey.com.au.txt: Initial commit (#862) 2021-03-07 00:08:29 +01:00
crimemagazine.com.txt
crimereads.com.txt Update crimereads.com.txt (#1073) 2023-03-31 16:10:56 +02:00
crimethinc.com.txt
criterion.com.txt Update criterion.com.txt 2022-08-20 00:12:35 +02:00
crn.de.txt
crunchyroll.com.txt
csmonitor.com.txt
csnphilly.com.txt
csoonline.com.txt Idg (#1786) 2025-11-12 16:15:45 +01:00
css-tricks.com.txt
csswizardry.com.txt
ctxt.es.txt Create ctxt.es.txt (#1566) 2025-03-11 14:57:47 +01:00
cucharasonica.com.txt
cultofmac.com.txt
culturebd.com.txt
cw.com.tw.txt
cwnp.com.txt
cyrille-borne.com.txt
da.feedsportal.com.txt
dadall.info.txt
dafoster.net.txt Create dafoster.net.txt (#1151) 2023-07-12 06:22:32 +02:00
dagogtid.no.txt
daily-osm-tips.getsendstack.com.txt add config for daily-osm-tips.getsendstack.com (#1009) 2022-11-02 10:01:43 +01:00
dailydot.com.txt
dailykos.com.txt
dailymail.co.uk.txt Update dailymail.co.uk.txt 2022-04-03 11:47:20 +02:00
dailymaverick.co.za.txt Update dailymaverick.co.za.txt 2021-05-03 13:08:26 +02:00
dailymotion.com.txt
dailynord.fr.txt
dailysabah.com.txt
dailyshincho.jp.txt add dailyshincho.jp.txt and shueisha.online.txt (#1824) 2025-12-19 13:36:11 +01:00
dailystar.com.lb.txt
dallasnews.com.txt Update dallasnews.com.txt (#1406) 2024-07-18 07:15:03 +02:00
danbooru.donmai.us.txt Create danbooru.donmai.us.txt (#1356) 2024-03-26 22:22:23 +01:00
danburzo.ro.txt Add metadata to danburzo.ro.txt (#1848) 2026-01-13 11:39:46 +01:00
danluu.com.txt Fix body extraction for https://danluu.com (#1767) 2025-10-02 09:41:12 +02:00
dansdata.com.txt
dantri.com.vn.txt
daringfireball.net.txt
daserste.ndr.de.txt
dasgelbeblatt.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
datasecuritybreach.fr.txt Create datasecuritybreach.fr.txt (#1642) 2025-06-06 09:28:17 +02:00
davidwalsh.name.txt
dazeddigital.com.txt Update dazeddigital.com.txt 2023-10-04 14:38:27 +02:00
dbazi.com.txt
dcurt.is.txt
deadline.com.txt
deadspin.com.txt Update deadspin.com.txt 2022-12-04 22:39:03 +01:00
declassifieduk.org.txt Create declassifieduk.org.txt 2023-05-24 21:51:58 +02:00
defenseone.com.txt Update defenseone.com.txt 2022-03-11 11:45:57 +01:00
deia.com.txt
deichstube.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
deliverydoubled.com.txt
delong.typepad.com.txt
democracynow.org.txt
demorgen.be.txt Update demorgen.be.txt (#1626) 2025-05-26 11:14:59 +02:00
denikn.cz.txt Update denikn.cz.txt with new paywall detection selector (#1559) 2025-02-27 11:41:57 +01:00
denofgeek.com.txt Create denofgeek.com.txt (#1737) 2025-08-07 16:04:20 +02:00
der-postillon.com.txt
derbund.ch.txt
derekseaman.com.txt Create derekseaman.com.txt (#1472) 2024-10-30 23:15:33 +01:00
derstandard.at.txt Standard2 (#1721) 2025-07-21 08:46:30 +02:00
derstandard.de.txt Standard2 (#1721) 2025-07-21 08:46:30 +02:00
des-livres-pour-changer-de-vie.fr.txt
designsponge.com.txt
designtagebuch.de.txt
deutsche-apotheker-zeitung.de.txt
dev.to.txt
devblogs.microsoft.com.txt add devblogs.microsoft.com (#804) 2020-09-17 08:44:07 +02:00
developer.mozilla.org.txt Update developer.mozilla.org.txt 2023-11-14 17:09:36 +01:00
developers.facebook.com.txt
devlinsangle.blogspot.co.at.txt
dezeen.com.txt Add dezeen.com.txt (#1207) 2023-09-22 17:38:14 +02:00
diagonalperiodico.net.txt
diamond-rm.net.txt add diamond-rm.net.txt (#1857) 2026-01-28 12:41:27 +01:00
diamond.jp.txt add diamond.jp (#1850) 2026-01-18 14:23:57 +01:00
dice.com.txt Create dice.com.txt 2023-11-18 18:05:42 +01:00
dictionary.reference.com.txt
diepresse.com.txt Update diepresse.com.txt (#1466) 2024-10-29 17:16:51 +01:00
digg.com.txt Create digg.com.txt 2022-01-08 12:17:16 +01:00
digiphoto.techbang.com.txt
digital-photography-school.com.txt Updated digital-photography-school.com.txt (#1351) 2024-03-12 13:43:25 +01:00
digitalcourage.de.txt Create digitalcourage.de.txt (#806) 2020-09-17 20:26:48 +02:00
digitalfernsehen.de.txt Update digitalfernsehen.de.txt 2023-10-20 15:43:13 +02:00
digitalforensics.com.txt
digitalkamera.de.txt add digitalkamera.de.txt for multipage fetching (#1288) 2024-01-01 00:26:58 +01:00
digitalspy.co.uk.txt
dilbert.com.txt
dinamalar.com.txt
disclose.ngo.txt Create disclose.ngo.txt (#1526) 2024-12-12 01:12:42 +01:00
discuss.logseq.com.txt Create discuss.logseq.com.txt (#1451) 2024-10-21 01:55:35 +02:00
discuss.pixls.us.txt add discuss.pixls.us.txt (#1782) 2025-11-03 18:23:21 +01:00
dispatchesjournal.org.txt Create dispatchesjournal.org.txt 2020-10-11 14:17:20 +02:00
dissentmagazine.org.txt
distributistreview.com.txt
dn.pt.txt
dobreprogramy.pl.txt
doc.rust-lang.org.txt fix issue: wallabag/wallabag/issues/7854 (#1506) 2024-11-24 20:31:19 +01:00
doc.rust-lang.ru.txt fix issue: wallabag/wallabag/issues/7854 (#1506) 2024-11-24 20:31:19 +01:00
doc.wallabag.org.txt
docs.cloud.google.com.txt Update docs.cloud.google.com.txt 2026-02-15 19:44:30 +01:00
docs.opnsense.org.txt Create docs.opnsense.org.txt (#1283) 2023-12-28 16:32:13 +01:00
dodgersway.com.txt Create dodgersway.com.txt (#1012) 2022-11-28 09:50:06 -08:00
domo-blog.fr.txt Create domo-blog.fr.txt (#1536) 2024-12-21 21:16:49 +01:00
domusweb.it.txt
donnahay.com.au.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
dorkly.com.txt
dou.ua.txt
douban.com.txt
doughellmann.com.txt
dpreview.com.txt
dr-b.io.txt add dr-b.io (#1100) 2023-06-19 09:07:00 +02:00
dr.dk.txt Update dr.dk.txt 2024-06-05 23:20:18 +02:00
drdobbs.com.txt
drgoulu.com.txt
drive2.ru.txt
dropbox.com.txt
drupal.org.txt
dummies.com.txt Create dummies.com.txt (#1163) 2023-07-21 06:25:41 +02:00
dushumashang.com.txt
dw.com.txt fix: dw.com body rule and add date (#1888) 2026-02-24 06:43:06 +01:00
dzone.com.txt
earther.com.txt
earvingad.github.io.txt Create earvingad.github.io.txt (#1753) 2025-08-30 09:01:18 +02:00
eastoftheweb.com.txt
eatsmarter.de.txt fix: eatsmarter single_page_link (#1889) 2026-02-24 06:45:42 +01:00
ebay.com.txt
ecetia.com.txt
echo-online.de.txt Update echo-online.de.txt 2023-10-15 15:00:22 +02:00
echo24.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
eckerd.edu.txt Create eckerd.edu.txt (#841) 2021-01-07 07:48:57 +01:00
econlog.econlib.org.txt
economichardship.org.txt Create economichardship.org.txt (#1578) 2025-03-30 22:27:22 +02:00
economie.gouv.fr.txt
economist.com.txt Update economist.com.txt (#1523) 2024-12-09 09:30:00 +01:00
ecranlarge.com.txt
edge-online.com.txt
edge.org.txt
edition.channel5belize.com.txt
edition.cnn.com.txt fix: cnn selectors (#1898) 2026-02-26 19:38:20 +01:00
edmunds.com.txt Create edmunds.com.txt (#1644) 2025-06-07 09:26:20 +02:00
edn.com.txt Create edn.com (#1138) 2023-07-06 22:37:46 +02:00
eetimes.com.txt
eff.org.txt eff.org: wrap quotes in blockquote (#912) 2021-10-29 22:36:48 +02:00
einfach-tasty.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
ekantipur.com.txt Update ekantipur.com.txt 2020-08-26 17:57:31 +02:00
ekultura.hu.txt
elance.com.txt
elblogsalmon.com.txt
elconfidencial.com.txt Update elconfidencial.com.txt 2022-06-06 12:52:44 +02:00
elderscrollsonline.com.txt
eleconomista.es.txt Update eleconomista.es (#816) 2020-10-02 15:25:46 +02:00
electrek.co.txt Add electrek.co.txt (#850) 2021-01-17 23:54:20 +01:00
electromaker.io.txt
elektroautomobil.com.txt add elektroautomobil.com (#1303) 2024-01-11 02:30:33 +01:00
elektroniknet.de.txt Update elektroniknet.de.txt (#1358) 2024-04-07 04:36:49 +02:00
elementor.contentlabs.ca.txt Create elementor.contentlabs.ca.txt 2022-12-20 00:02:09 +01:00
elespanol.com.txt Update elespanol.com (#1003) 2022-11-02 09:57:42 +01:00
elfster.com.txt add elfster.com to remove ads (#1096) 2023-06-19 09:05:43 +02:00
elmalpensante.com.txt
elmundo.es.txt Update elmundo.es (#1002) 2022-11-02 09:57:22 +01:00
elpais.com.txt Elpais (#1561) 2025-02-28 05:15:51 +01:00
eltonjohn.com.txt Create eltonjohn.com.txt (#1589) 2025-04-16 08:17:16 +02:00
emaratalyoum.com.txt
en.espnf1.com.txt
engadget.com.txt
engineering.tumblr.com.txt
english.aljazeera.net.txt
enikos.gr.txt
enterprisersproject.com.txt
entertainment.timesonline.co.uk.txt
entheogenesis.org.txt Entheogenesis (#1430) 2024-09-02 14:29:31 +02:00
entrepreneurshandbook.co.txt Update entrepreneurshandbook.co.txt (#1170) 2023-07-25 06:45:09 +02:00
entwickler.de.txt
enviscope.com.txt
erdorin.org.txt Adding rules for erdorin.org (#1873) 2026-02-20 16:34:27 +01:00
ericsuh.com.txt
ernestmag.fr.txt
escapistmagazine.com.txt
esglobal.org.txt
espacepolitique.revues.org.txt
espn.go.com.txt
esquire.com.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
esslinger-zeitung.de.txt 9 new MHS-Digital sites (#1088) 2023-06-09 06:18:07 +02:00
essonneinfo.fr.txt
estadao.com.br.txt
eternabuenosaires.com.txt
euractiv.com.txt fix: euractiv.com (#1867) 2026-02-10 14:08:22 +01:00
euractiv.fr.txt Add euractiv.fr.txt (#1066) 2023-03-19 20:48:29 +01:00
eurogamer.net.txt Improvements to eurogamer.net, heise.de, rockpapershotgun.com, tagesschau.de and zeit.de. Fix golem.de (#936) 2022-02-28 06:39:51 +01:00
everway.com.txt Add content extraction rules for everway.com 2025-12-22 23:22:07 +01:00
everydayfeminism.com.txt
evo.co.uk.txt
eweek.com.txt
exoplanets.nasa.gov.txt Add exoplanets.nasa.gov.txt (#949) 2022-03-08 06:17:57 +01:00
explainthatstuff.com.txt Create explainthatstuff.com.txt 2021-02-01 13:57:24 +01:00
explosm.net.txt Update explosm.net.txt (#991) 2022-09-02 07:02:17 +02:00
expresso.sapo.pt.txt
extracine.com.txt
extratipp.com.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
f-droid.org.txt chore: add body and date rules for f-droid (#1894) 2026-02-24 15:09:08 +01:00
facebook.com.txt
facta.co.jp.txt
factuel.info.txt
fair.org.txt
fairphone.com.txt Added fairphone.com.txt (#812) 2020-10-02 12:43:51 +02:00
fakirpresse.info.txt
falter.at.txt
fanfiction.net.txt
fastcompany.com.txt Update fastcompany.com.txt 2021-05-19 22:52:48 +02:00
fathers.pl.txt Fix test_url entry in fathers.pl.txt (#1861) 2026-01-31 16:52:08 +01:00
favouritehumandesign.com.txt Create favouritehumandesign.com.txt (#1654) 2025-06-08 19:53:06 +02:00
faz.net.txt Add new test URL and update strip attributes (#1831) 2025-12-28 22:14:11 +01:00
feeds.feedblitz.com.txt
fehmarn24.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
feinschwarz.net.txt Create feinschwarz.net.txt (#1734) 2025-08-04 00:07:00 +02:00
fernbahntunnel-frankfurt.de.txt Add files via upload (#1419) 2024-08-08 14:48:18 +02:00
fertigung.de.txt
fictionpress.com.txt
ficwad.com.txt
fidelitydigitalassets.com.txt Update fidelitydigitalassets.com.txt (#1700) 2025-07-04 09:36:30 +02:00
fiftytwo.in.txt Create fiftytwo.in.txt 2020-10-20 11:02:39 +02:00
filamentgroup.com.txt
filmstarts.de.txt
finance.yahoo.co.jp.txt add finance.yahoo.co.jp.txt and topnews.jp.txt (#1856) 2026-01-26 19:21:42 +01:00
findtheswagger.tumblr.com.txt
finexpert.e15.cz.txt
fingerprint.ippen.media.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
firstmonday.org.txt Create firstmonday.org.txt (#1291) 2024-01-05 01:31:42 +01:00
firstthings.com.txt
fivebooks.com.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
fivefilters.org.txt
fivethirtyeight.com.txt
flyingmachinestudios.com.txt
fm4.orf.at.txt
fmhy.net.txt Update fmhy.net.txt (#1353) 2024-03-12 17:02:25 +01:00
fnal.gov.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
fnp.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
focus-numerique.com.txt
focus.de.txt Focus.de (#1390) 2024-06-15 00:28:21 +02:00
fok.nl.txt
fokus.se.txt
foley.com.txt Update foley.com.txt 2020-08-26 01:55:02 +02:00
folklore.org.txt
food.com.txt
fool.com.txt
forbes.com.txt Update polygon.com and forbes.com (#843) 2021-01-08 13:36:50 +01:00
forbesjapan.com.txt Add replace(h2) and use strip id or class (#1828) 2025-12-22 09:34:12 +01:00
forbiddenstories.org.txt Create forbiddenstories.org.txt 2023-04-19 17:35:11 +02:00
foreignaffairs.com.txt Update foreignaffairs.com.txt 2022-04-09 16:44:49 +02:00
foreignpolicy.com.txt Update foreignpolicy.com.txt 2024-12-13 14:06:52 +01:00
formula1.com.txt Create formula1.com.txt (#1579) 2025-03-31 00:12:39 +02:00
forsvaret.no.txt
fortelabs.co.txt Update fortelabs.co.txt 2021-04-22 15:35:42 +02:00
forum.revvox.de.txt add forum.revvox.de.txt (#1783) 2025-11-03 18:24:31 +01:00
forward.com.txt Create forward.com.txt 2023-11-18 17:44:56 +01:00
fossbytes.com.txt
foxnews.com.txt
fr.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
framablog.org.txt
france24.com.txt
franceculture.fr.txt
franceinfo.fr.txt Update franceinfo.fr.txt (#1761) 2025-09-19 22:42:44 +02:00
frandroid.com.txt Update frandroid.com.txt 2026-02-06 17:26:21 +01:00
frankenpost.de.txt 9 new MHS-Digital sites (#1088) 2023-06-09 06:18:07 +02:00
frankwatching.com.txt chore: rename frankwatching and add body rule (#1887) 2026-02-23 19:42:53 +01:00
freecodecamp.org.txt Create freecodecamp.org.txt (#935) 2022-02-21 14:26:04 +01:00
freelancer.com.txt
freemovement.org.uk.txt Create freemovement.org.uk.txt (#1417) 2024-08-08 14:00:12 +02:00
fria.nu.txt
friatidningen.se.txt
frmplus.de.txt Add files via upload (#1419) 2024-08-08 14:48:18 +02:00
fromreformationtoreformation.com.txt Create fromreformationtoreformation.com.txt (#1622) 2025-05-21 08:21:45 +02:00
frontburner.dmagazine.com.txt
frontpagelinux.com.txt add frontpagelinux.com (#811) 2020-10-01 21:16:08 +02:00
fs.blog.txt Create fs.blog.txt 2022-08-12 01:40:21 +02:00
ft.com.txt Update ft.com.txt (#1343) 2024-02-21 22:18:51 +01:00
ftchinese.com.txt updated ftchinese.com.txt (#836) 2020-12-28 16:26:00 +01:00
fularsizentellik.com.txt
fuldaerzeitung.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
funnyjunk.com.txt Update funnyjunk.com.txt 2021-03-14 13:51:25 +01:00
futura-sciences.com.txt
futurezone.at.txt
futurism.com.txt fix: futurism.com.txt (#1870) 2026-02-18 16:12:17 +01:00
fzone.cz.txt
gamasutra.com.txt
gameblog.fr.txt
gamedev.net.txt
gamekult.com.txt Updated gamekult.com.txt (#875) 2021-04-30 18:40:23 +02:00
gamer.no.txt
gamereactor.no.txt
gamesradar.com.txt Create gamesradar.com.txt (#1606) 2025-05-05 06:53:46 +02:00
gameswirtschaft.de.txt
ganglia.info.txt
gatesnotes.com.txt Update gatesnotes.com.txt (#1499) 2024-11-17 09:52:42 +01:00
gatopardo.com.txt
gauchiste.fr.txt
gawker.com.txt
geeksofdoom.com.txt
geenstijl.nl.txt
gendai.media.txt add gendai.media.txt xenospectrum.com.txt taxacc.jp.txt (#1821) 2025-12-17 05:44:08 +01:00
generation-nt.com.txt Update generation-nt.com.txt (#1025) 2023-01-12 15:57:19 +01:00
germangirlinamerica.com.txt Create germangirlinamerica.com.txt (#1137) 2023-07-06 11:18:31 +02:00
geschichtedergegenwart.ch.txt Create geschichtedergegenwart.ch.txt (#1183) 2023-08-20 12:58:25 +02:00
getnews.jp.txt
getpocket.com.txt Update getpocket.com.txt 2021-05-12 22:01:55 +02:00
ghanaweb.com.txt
giantbomb.com.txt
giessener-allgemeine.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
giga.de.txt
gigaom.com.txt
gihyo.jp.txt
gist.github.com.txt
git-scm.com.txt
github.blog.txt
github.com.txt Fix XPath selector for body content 2026-01-17 20:15:34 +01:00
gizmodo.com.txt fix: bad test_contains directives (#1874) 2026-02-20 18:09:13 +01:00
gizmodo.uol.com.br.txt
gizmologia.com.txt
gizmovil.com.txt
glasnaya.media.txt Update glasnaya.media.txt 2023-11-09 16:55:29 +01:00
glazman.org.txt
global.txt Update global.txt 2020-10-24 11:42:27 +02:00
globalgrind.com.txt fix: globalgrind.com rules (#1899) 2026-02-26 18:32:00 +01:00
globalissues.org.txt
globalresearch.ca.txt
gloswielkopolski.pl.txt
gnppn.fr.txt
gnu.org.txt fix: allow getting body from man pages for gnu.org (#1890) 2026-02-24 06:46:35 +01:00
gnz.de.txt Create gnz.de.txt (#1535) 2024-12-21 14:06:24 +01:00
goal.com.txt
gocomics.com.txt
gofugyourself.com.txt
gokulkrishh.github.io.txt
gold.ac.uk.txt
goldseiten.de.txt Update goldseiten.de.txt (#1056) 2023-02-19 22:39:21 +01:00
golem.de.txt Update cookie consent value in golem.de.txt (#1891) 2026-02-24 06:47:36 +01:00
good.is.txt
goodfil.ms.txt
goodhousekeeping.com.txt Create goodhousekeeping.com.txt (#1742) 2025-08-10 05:36:56 +02:00
goodreads.com.txt
gorky.media.txt Create gorky.media.txt 2020-09-03 13:12:52 +02:00
gossip-tv.gr.txt
goteborgsfria.se.txt
gothamist.com.txt
gov.uk.txt Update gov.uk.txt 2022-10-16 10:54:58 +02:00
gp.se.txt
gq-magazine.co.uk.txt Update gq-magazine.co.uk.txt (#1202) 2023-09-11 21:13:36 +02:00
gq.com.txt
grafikart.fr.txt
granta.com.txt Update granta.com.txt 2021-03-26 21:46:36 +01:00
grantland.com.txt
greatergreaterwashington.org.txt
greaterwrong.com.txt Create greaterwrong.com.txt 2020-10-16 20:41:55 +02:00
greensavers.sapo.pt.txt Create greensavers.sapo.pt.txt (#985) 2022-07-19 15:13:38 +02:00
grisebouille.net.txt Grisebouille (#1881) 2026-02-23 14:25:31 +01:00
groene.nl.txt Update groene.nl.txt (#1158) 2023-07-17 11:28:59 +02:00
groups.drupal.org.txt
grubstreet.com.txt
grumpygamer.com.txt
gsmarena.com.txt
gulfnews.com.txt
guokr.com.txt
gurumed.org.txt
gurusblog.com.txt
gutenberg.org.txt Update gutenberg.org.txt 2026-01-29 12:51:47 +01:00
guyaweb.com.txt
haaretz.co.il.txt Create haaretz.co.il.txt (#1069) 2023-03-23 14:35:43 +01:00
haaretz.com.txt Update haaretz.com.txt 2025-06-25 17:37:36 -04:00
haberler.com.txt
habr.com.txt Update habr.com.txt (#1470) 2024-10-30 03:36:19 +01:00
habrahabr.ru.txt
hacf.fr.txt Create hacf.fr.txt (#1704) 2025-07-06 23:11:27 +02:00
hackersrepublic.org.txt
hackertarget.com.txt
hackmake.org.txt
hackneycitizen.co.uk.txt Update hackneycitizen.co.uk.txt 2021-05-26 00:55:08 +02:00
hacks.mozilla.org.txt
hallo-muenchen.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
halo.bungie.org.txt
hanau-wuerzburg-fulda.de.txt Add files via upload (#1419) 2024-08-08 14:48:18 +02:00
hanauer.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
handelsblatt.com.txt fix: bad format errors (#1811) 2025-12-09 13:48:06 +01:00
hanselman.com.txt
happyassassin.net.txt
hardware.fr.txt
hardware.no.txt
hardwareluxx.de.txt update hardwareluxx (#1809) 2025-12-08 01:31:34 +01:00
harpers.org.txt Update harpers.org.txt 2023-01-17 22:25:58 +01:00
harzkurier.de.txt Funke (#1373) 2024-04-30 21:37:22 +02:00
has-sante.fr.txt Create has-sante.fr.txt (#1719) 2025-07-20 10:30:26 +02:00
hazlitt.net.txt
hbr.org.txt Refactor selectors and strip rules in hbr.org.txt (#1906) 2026-03-03 00:25:05 +01:00
headrush.typepad.com.txt
health.com.txt
health.gov.au.txt
healthland.time.com.txt
healthletter.mayoclinic.com.txt
healthline.com.txt Update healthline.com.txt 2023-01-27 15:11:56 +01:00
heatmap.news.txt Shtrom 2024 01 (#1326) 2024-01-31 16:15:41 +01:00
heidelberg24.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
heise.de.txt Add strip rule for SVG images (#1855) 2026-01-24 14:29:24 +01:00
hellofresh.de.txt Create hellofresh.de.txt (#1368) 2024-04-25 18:58:10 +02:00
help.fivefilters.org.txt
help.sharegate.com.txt Create help.sharegate.com.txt 2025-09-11 16:17:44 +02:00
hemmings.com.txt
herbstfest-rosenheim.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
hersfelder-zeitung.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
hespress.com.txt
hessen.de.txt Update hessen.de.txt (#1179) 2023-08-12 10:19:02 +02:00
hessenschau.de.txt Update hessenschau.de.txt (#1714) 2025-07-15 11:10:29 +02:00
higcapital.com.txt
highscalability.com.txt
hiiraan.com.txt
hillstreetgrocer.com.txt
hindustantimes.com.txt fix: hindustantimes rules (#1901) 2026-02-26 18:41:59 +01:00
hiperpop.com.txt
hipertextual.com.txt
hiphopleeft.nl.txt
histoire-filante.fr.txt
histoire.presse.fr.txt
historic-uk.com.txt
historytoday.com.txt
hln.be.txt
hmercer.com.txt
hna.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
hochheimer-zeitung.de.txt Update hochheimer-zeitung.de.txt 2023-10-15 15:02:12 +02:00
hodinkee.com.txt Update hodinkee.com.txt 2022-01-16 13:01:15 +01:00
hollywoodlife.com.txt
homeofsports.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
hometheaterreview.com.txt
hosted.ap.org.txt
hosted2.ap.org.txt
houstonchronicle.com.txt
howtogeek.com.txt Update howtogeek.com.txt (#1487) 2024-11-09 09:25:03 +01:00
hpd.de.txt Create hpd.de.txt (#1439) 2024-10-10 22:13:26 +02:00
hs.fi.txt Update hs.fi (#1710) 2025-07-14 20:11:24 +02:00
ht.ly.txt
huffingtonpost.co.uk.txt
huffingtonpost.fr.txt Update huffingtonpost.fr.txt 2025-07-24 14:22:13 +02:00
huffpost.com.txt Update huffpost.com.txt 2020-08-24 18:25:17 +02:00
humanite.fr.txt
humantransit.org.txt
hurriyet.com.tr.txt
hvg.hu.txt
hypebeast.com.txt
ianlewis.org.txt
iansommerville.com.txt
icannabis.tumblr.com.txt
ichkoche.at.txt Create ichkoche.at.txt (#1324) 2024-01-28 10:26:58 +01:00
ici.radio-canada.ca.txt Update ici.radio-canada.ca.txt (#1732) 2025-08-02 15:28:36 +02:00
idealog.co.nz.txt
idlewords.com.txt
ieeexplore.ieee.org.txt Update ieeexplore.ieee.org.txt 2023-10-15 08:15:49 +02:00
ietf.org.txt Create ietf.org.txt 2023-11-04 10:46:47 +01:00
igen.fr.txt Updated igen.fr.txt (#1868) 2026-02-14 09:38:35 +01:00
igeneration.fr.txt
ikz-online.de.txt Funke (#1373) 2024-04-30 21:37:22 +02:00
ilounge.com.txt
ilsoftware.it.txt Ilsoftware (#1645) 2025-06-07 12:59:11 +02:00
ilyabirman.ru.txt
immub.org.txt Update immub.org.txt (#1518) 2024-12-05 14:20:13 +01:00
imore.com.txt Update imore.com.txt 2023-04-02 00:38:51 +02:00
in-muenchen.de.txt changed 57 files for ippen.media sites (#1383) 2024-06-05 12:06:20 +02:00
inc.com.txt Update inc.com.txt (#1703) 2025-07-05 06:51:08 +02:00
indehekken.net.txt
independent.co.uk.txt Update independent.co.uk.txt 2021-08-29 01:24:25 +02:00
indiatimes.com.txt
indiehackers.com.txt
indiewire.com.txt Update indiewire.com.txt (#1739) 2025-08-08 13:35:53 +02:00
indiscreto.org.txt Create indiscreto.org.txt (#1655) 2025-06-11 17:23:59 +02:00
inessential.com.txt
infolibre.es.txt Add infolibre.es (#970) 2022-05-18 06:58:32 +02:00
infoq.com.txt
informador.com.mx.txt
information.dk.txt
informationarchitects.net.txt
informationclearinghouse.info.txt
informit.com.txt
infovaticana.com.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
infoworld.com.txt Idg (#1786) 2025-11-12 16:15:45 +01:00
infzm.com.txt
ingame.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
inhabitat.com.txt
innsalzach24.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
inquirer.com.txt Update inquirer.com.txt 2021-04-09 11:31:07 +02:00
inquirer.net.txt Update inquirer.net.txt 2024-06-07 12:32:00 +02:00
instagr.am.txt
instagram.com.txt Create instagram.com.txt (#1675) 2025-06-19 07:34:49 +02:00
instructables.com.txt Update instructables.com.txt 2021-03-05 19:42:36 +01:00
insuedthueringen.de.txt add config for insuedthueringen.de (#1086) 2023-06-09 06:17:39 +02:00
intelligenceonline.fr.txt
interconnected.org.txt Create interconnected.org.txt (#1481) 2024-11-04 02:01:46 +01:00
interestingengineering.com.txt add config for interestingengineering.com (#986) 2022-08-12 00:22:22 +02:00
intern-mag.com.txt Create intern-mag.com.txt 2022-02-04 00:26:54 +01:00
interviewmagazine.com.txt
investigation.rollingstone.com.txt Update investigation.rollingstone.com.txt 2023-02-05 20:40:27 +01:00
investopedia.com.txt Update investopedia.com.txt 2021-06-20 18:26:05 +02:00
inwestomat.eu.txt Create inwestomat.eu.txt (#1057) 2023-02-22 07:55:07 +01:00
ipadclub.nl.txt
ipadplanet.nl.txt
iphon.fr.txt Update iphon.fr.txt 2023-09-21 10:42:33 +02:00
iphoneaddict.fr.txt
iphoneclub.nl.txt
iphonehacks.com.txt
iphonetweak.fr.txt
iplaysoft.com.txt
ishadeed.com.txt Create ishadeed.com (#931) 2022-02-16 22:00:35 +01:00
iso.500px.com.txt
isource.com.txt
ispatguru.com.txt Create ispatguru.com.txt (#1311) 2024-01-19 08:36:06 +01:00
it-connect.fr.txt
italpassion.fr.txt add 3 files (#1829) 2025-12-23 11:30:11 +01:00
itavisen.no.txt
itmedia.co.jp.txt Add replace(h2) and use strip id or class (#1828) 2025-12-22 09:34:12 +01:00
itnews.com.au.txt
itsfoss.com.txt fix: itsfoss rules (#1895) 2026-02-24 15:39:39 +01:00
itstactical.com.txt
itunes.apple.com.txt
itwire.com.txt
izismile.com.txt
jack-vanlightly.com.txt Add initial content for jack-vanlightly.com analysis (#1836) 2026-01-04 08:42:41 +01:00
jacobin.com.br.txt Create scraping configuration for jacobin.com.br (#1780) 2025-10-26 21:25:03 +01:00
jacobin.com.txt Update and rename jacobinmag.com.txt to jacobin.com.txt 2022-10-20 17:01:24 +02:00
jacobnordby.com.txt add jacobnordby.com (#1795) 2025-11-29 19:23:23 +01:00
jalopnik.com.txt The kinja sites updated their engine and now they tag their body content using "js_post-content" instead of just "post-content" (#917) 2021-11-29 19:45:01 +01:00
jamesclear.com.txt
jameslandrith.com.txt
jamieoliver.com.txt
jandan.net.txt
japoninfos.com.txt
javascript.plainenglish.io.txt Create javascript.plainenglish.io.txt (#1146) 2023-07-12 06:27:41 +02:00
jbpress.ismedia.jp.txt add two sites (#1832) 2025-12-29 14:56:41 +01:00
jdubuzz.com.txt
je-suis-papa.com.txt
jesuisundev.com.txt
jetzt.de.txt
jetzt.sueddeutsche.de.txt
jeuxvideo.com.txt
jezebel.com.txt
jjahnke.net.txt
jneurosci.org.txt
jobbank.gc.ca.txt
joelonsoftware.com.txt
johannesbader.ch.txt
johnnysgamelogs.fr.txt
jollinger.com.txt Create jollinger.com.txt 2021-01-06 16:57:26 +01:00
journal.markusthoma.com.txt add journal.markusthoma.com (#1304) 2024-01-11 02:36:36 +01:00
journaldugamer.com.txt
journaldugeek.com.txt
journals.biologists.com.txt Create journals.biologists.com.txt (#1693) 2025-07-02 16:18:45 +02:00
journals.plos.org.txt
journals.sagepub.com.txt Update journals.sagepub.com.txt 2021-04-02 15:40:21 +02:00
joystiq.com.txt
jp.motorsport.com.txt add jp.motorsport.com (#1818) 2025-12-14 13:13:49 +01:00
jp.reuters.com.txt Update xenospectrum.com add newsphere.jp jp.reuters.com (#1847) 2026-01-11 16:34:29 +01:00
jpmens.net.txt Create jpmens.net.txt (#1182) 2023-08-20 12:59:00 +02:00
jsforcats.com.txt
juedische-allgemeine.de.txt
juejin.cn.txt Add juejin.cn config (#916) 2021-11-22 07:04:30 +01:00
juliareda.eu.txt
julieandrieu.com.txt
jungle-world.com.txt
juppy.org.txt
jvns.ca.txt Create jvns.ca.txt (#1587) 2025-04-15 08:59:21 +02:00
jvt.me.txt
kachestvo.ru.txt
kathimerini.gr.txt
kattascha.de.txt
kb.mailbox.org.txt
kenfm.de.txt Create kenfm.de.txt 2020-09-22 09:13:53 +02:00
kenrockwell.com.txt
keyboardmag.com.txt
keycloak.org.txt Create keycloak.org.txt 2021-02-01 13:45:32 +01:00
kicker.de.txt
kickstarter.com.txt
kinder-verstehen.de.txt
kingarthurflour.com.txt
kingstonist.com.txt Create kingstonist.com.txt (#1515) 2024-12-05 10:01:12 +01:00
kingz.fr.txt
kinocheck.de.txt Create kinocheck.de.txt (#1730) 2025-07-31 21:08:29 +02:00
klimareporter.de.txt fix: klimareporter.de rules (#1902) 2026-02-26 18:43:04 +01:00
knoten-stadion.de.txt Add files via upload (#1419) 2024-08-08 14:48:18 +02:00
knowablemagazine.org.txt Create knowablemagazine.org.txt (#1531) 2024-12-16 21:44:01 +01:00
ko-fi.com.txt Add cohost.org ko-fi.com and pcgamer.com (#1364) 2024-04-13 16:18:23 +02:00
kochbar.de.txt fix: kochbar body rule (#1904) 2026-02-26 18:45:54 +01:00
kommersant.ru.txt Update kommersant.ru.txt 2023-11-16 15:50:08 +01:00
kont.me.txt Add a custom user agent to retrieve kont.me (#872) 2021-04-08 19:06:16 +02:00
korben.info.txt Update korben.info.txt (#1729) 2025-07-30 20:55:48 +02:00
kotaku.com.txt The kinja sites updated their engine and now they tag their body content using "js_post-content" instead of just "post-content" (#917) 2021-11-29 19:45:01 +01:00
kottke.org.txt
kqed.org.txt Create kqed.org.txt (#1496) 2024-11-16 11:24:48 +01:00
krautreporter.de.txt krautreporter: fix retrieving images (#997) 2022-10-07 20:36:02 +02:00
krebsonsecurity.com.txt
kreis-anzeiger.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
kreisbote.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
kreiszeitung.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
kresus.org.txt
kriswrites.com.txt Create kriswrites.com.txt (#1144) 2023-07-12 06:28:58 +02:00
krone.at.txt
krzbb.de.txt 9 new MHS-Digital sites (#1088) 2023-06-09 06:18:07 +02:00
kuemmerle.name.txt Create kuemmerle.name.txt (#1448) 2024-10-19 04:03:39 +02:00
kulturegeek.fr.txt
kumailplus.com.txt
kumb.com.txt
kurier.de.txt 9 new MHS-Digital sites (#1088) 2023-06-09 06:18:07 +02:00
kurierverlag.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
kwerfeldein.de.txt
kyoko-np.net.txt add two sites (#1832) 2025-12-29 14:56:41 +01:00
labs.bishopfox.com.txt Config for blogposts at labs.bishopfox.com (#853) 2021-01-23 11:46:37 +01:00
labs.mwrinfosecurity.com.txt
labs.ripe.net.txt Shtrom 2024 03 (#1347) 2024-03-05 11:52:33 +01:00
lactualite.com.txt Update lactualite.com.txt (#1716) 2025-07-18 09:37:35 +02:00
lado.mx.txt Create lado.mx.txt 2024-02-12 16:09:33 -06:00
lalettrea.fr.txt
lalibre.be.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
lampertheimer-zeitung.de.txt Update lampertheimer-zeitung.de.txt 2023-10-15 15:02:55 +02:00
landetsfria.se.txt
landtiere.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
lapetiteokara.fr.txt Extract content structure from lapetiteokara.fr (#1837) 2026-01-04 09:20:28 +01:00
laphamsquarterly.org.txt Update laphamsquarterly.org.txt 2021-12-19 11:54:06 +01:00
lapin-blanc.blogs.docteo.net.txt
lapresse.ca.txt Update XPath selectors and test URLs in lapresse.ca.txt (#1853) 2026-01-24 09:16:45 +01:00
lapresselibre.info.txt LPL (#1788) 2025-11-18 21:24:33 +01:00
laquadrature.net.txt
lareviewofbooks.org.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
larevuedesmedias.ina.fr.txt Add larevuedesmedias.ina.fr (#849) 2021-01-17 23:04:51 +01:00
latimes.com.txt Update latimes.com.txt 2020-10-24 11:47:58 +02:00
laughingsquid.com.txt
lauterbacher-anzeiger.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
lavenir.net.txt Add support for lavenir.net (#1871) 2026-02-20 07:30:17 +01:00
lawfareblog.com.txt
leancrew.com.txt
learn.microsoft.com.txt fix: msdn rules (#1903) 2026-02-26 18:45:06 +01:00
leb.fbi.gov.txt Update leb.fbi.gov.txt 2022-11-09 00:42:06 +01:00
leblogduhacker.fr.txt
lececil.org.txt
lecker.de.txt
ledauphine.com.txt Create ledauphine.com.txt 2024-06-09 22:50:39 +02:00
ledoc-info.com.txt
leereamsnyder.com.txt Create leereamsnyder.com.txt (#1425) 2024-08-24 10:26:56 +02:00
lefigaro.fr.txt Update lefigaro.fr.txt 2025-02-19 17:01:04 +01:00
lefilrouge.media.txt
legrandcontinent.eu.txt XPath updates for legrandcontinent.eu (#1793) 2025-11-25 07:30:03 +01:00
lehollandaisvolant.net.txt
leinetal24.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
lejournal.cnrs.fr.txt
lemmy.ml.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
lemonde.fr.txt Fix log-in for wallabag (#1865) 2026-02-06 15:31:23 +01:00
lenta.ru.txt
leon.jp.txt add leon.jp (#1826) 2025-12-20 11:47:29 +01:00
lepoint.fr.txt
lequatreheures.com.txt
lequipe.fr.txt Add lequipe.fr (#827) 2020-11-17 12:03:05 +01:00
lesecolohumanistes.fr.txt Add lesecolohumanistes.fr.txt (#877) 2021-05-04 08:58:30 +02:00
lesjours.fr.txt Lesjours (#1533) 2024-12-17 15:53:04 +01:00
lesnumeriques.com.txt Fix lesnumeriques.com (#1723) 2025-07-26 09:59:54 +02:00
lesoir.be.txt Create lesoir.be.txt (#819) 2020-10-08 20:31:45 +02:00
lesprosdelapetiteenfance.fr.txt Create lesprosdelapetiteenfance.fr.txt 2023-08-25 09:49:18 +02:00
lesswrong.com.txt Create lesswrong.com.txt 2021-01-22 22:24:10 +01:00
letraslibres.com.txt
lexpress.fr.txt Create lexpress.fr.txt (#1440) 2024-10-10 22:41:03 +02:00
lezephyrmag.com.txt
libcom.org.txt
liberation.fr.txt Update liberation.fr.txt 2025-10-13 10:43:44 +02:00
LICENSE.txt
lifeclub.org.txt Create lifeclub.org.txt (#1127) 2023-07-03 09:28:58 +02:00
lifehack.org.txt
lifehacker.com.txt Update lifehacker.com.txt (#1676) 2025-06-19 19:21:50 +02:00
lifehacker.ru.txt Modify body selector and prevent indentation (#1784) 2025-11-06 09:13:32 +01:00
lifestyle.inquirer.net.txt
lifeweek.com.cn.txt
lightreading.com.txt add site config for lightreading.com (#851) 2021-01-19 14:59:05 +01:00
limo.media.txt add limo.media.txt and news.infoseek.co.jp.txt (#1863) 2026-02-02 20:03:59 +01:00
limprevu.fr.txt
link.springer.com.txt Link.springer (#1325) 2024-01-28 17:38:34 +01:00
linkedin.com.txt Modify LinkedIn scraping configuration (#1802) 2025-12-04 13:07:08 +01:00
linux-community.de.txt Update linux-community.de.txt (#1108) 2023-06-19 16:57:42 +02:00
linux-magazin.de.txt update linux-magazin.de config (#1011) 2022-11-15 04:54:11 +01:00
linux.com.txt
linuxconfig.org.txt Create linuxconfig.org.txt (#1323) 2024-01-28 09:44:57 +01:00
linuxjournal.com.txt
linuxnix.com.txt
literaryreview.co.uk.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
lithub.com.txt Update XPath for article metadata extraction (#1910) 2026-03-04 01:03:44 +01:00
livescience.com.txt Cleanup livescience.com.txt (#848) 2021-01-17 23:04:26 +01:00
longform.org.txt
longreads.com.txt Create longreads.com.txt 2020-09-29 12:23:08 +02:00
longreads.tni.org.txt fix: bad test_contains directives (#1874) 2026-02-20 18:09:13 +01:00
loopinsight.com.txt
lostgarden.com.txt
lotro.com.txt Update lotro.com.txt (#1686) 2025-06-27 15:50:05 +02:00
loudersound.com.txt Create loudersound.com.txt (#1738) 2025-08-08 13:34:34 +02:00
lowtechmagazine.com.txt
lrb.co.uk.txt Update lrb.co.uk.txt 2020-11-05 15:40:04 +01:00
ludwigshafen24.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
lukew.com.txt
luminous-landscape.com.txt
lupa.cz.txt
lux-magazine.com.txt Create lux-magazine.com.txt 2021-05-19 22:42:34 +02:00
luxuo.com.txt
lvsl.fr.txt
lwlies.com.txt Create lwlies.com.txt 2020-09-17 15:52:00 +02:00
lwn.net.txt Fix lwn.net login (#1529) 2024-12-14 13:07:34 +01:00
lynalden.com.txt Create lynalden.com.txt (#1507) 2024-11-27 01:02:04 +01:00
m.bbc.co.uk.txt
m.douban.com.txt
m.dw.com.txt
m.facebook.com.txt
m.theregister.co.uk.txt
m.wikihow.com.txt
m.xkcd.com.txt
m00natic.github.io.txt
mac4ever.com.txt Update mac4ever.com.txt 2023-01-04 15:16:07 +01:00
macdrifter.com.txt
macg.co.txt Update macg.co.txt 2024-10-08 11:09:20 +02:00
macmagazine.com.br.txt
macrumors.com.txt
macstories.net.txt
mactalk.com.au.txt
mactechnews.de.txt
macworld.com.txt
mailchi.mp.txt Update mailchi.mp.txt 2024-10-20 15:38:40 +02:00
main-spitze.de.txt Update main-spitze.de.txt 2023-10-15 15:06:48 +02:00
mainichi.jp.txt add as-web.jp.txt and mainichi.jp.txt (#1822) 2025-12-17 20:01:19 +01:00
mainpost.de.txt
maitre-eolas.fr.txt
make.wordpress.org.txt Create make.wordpress.org.txt 2020-12-29 01:29:14 +01:00
makramayache.com.txt Create www.makramayache.com.txt (#1524) 2024-12-10 22:36:18 +01:00
malekal.com.txt Create malekal.com.txt (#1214) 2023-10-02 06:14:13 +02:00
manager-magazin.de.txt Update manager-magazin.de.txt (#1320) 2024-01-26 19:59:13 +01:00
manager.co.th.txt
manga-news.com.txt
mangfall24.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
mannheim24.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
marciniwuc.com.txt Create marciniwuc.com.txt (#1058) 2023-02-22 13:38:14 +01:00
marco.org.txt
marcobehler.com.txt Create marcobehler.com.txt (#1164) 2023-07-21 06:25:26 +02:00
marcvidal.net.txt
marginalrevolution.com.txt Create marginalrevolution.com.txt (#1563) 2025-03-04 22:11:37 +01:00
marigold.cz.txt
maritimedanmark.dk.txt Create maritimedanmark.dk.txt 2024-06-06 11:27:10 +02:00
marketresearchdirect.com.txt
markmanson.net.txt fix: bad test_contains directives (#1874) 2026-02-20 18:09:13 +01:00
marksdailyapple.com.txt
marktechpost.com.txt Create marktechpost.com.txt (#1449) 2024-10-19 22:12:21 +02:00
marmiton.org.txt
marriedtothesea.com.txt
marsactu.fr.txt
martinfowler.com.txt
mashable.com.txt
matija.suklje.name.txt Create matija.suklje.name.txt (#1486) 2024-11-09 09:18:08 +01:00
matt.might.net.txt
mattcutts.com.txt
matthewball.co.txt Entheogenesis (#1430) 2024-09-02 14:29:31 +02:00
maxim.com.txt
mbari.org.txt
mbk-news.appspot.com.txt Create mbk-news.appspot.com.txt 2020-09-17 16:00:35 +02:00
mbl.is.txt
mccarthy.ca.txt Create mccarthy.ca.txt 2025-07-31 15:05:33 +02:00
mcconnellsmedchem.com.txt Create mcconnellsmedchem.com.txt (#1577) 2025-03-30 07:17:51 +02:00
mcorbin.fr.txt Add mcorbin.fr configuration (#1189) 2023-08-22 00:16:41 +02:00
mdpi.com.txt
mdr.de.txt
mebedo.de.txt
mediacites.fr.txt LPL (#1788) 2025-11-18 21:24:33 +01:00
medialens.org.txt Update medialens.org.txt 2021-05-09 15:32:13 +02:00
mediapart.fr.txt Update mediapart.fr.txt (#1728) 2025-08-01 21:50:11 +02:00
medium.com.txt Medium.com (#1169) 2023-07-25 06:45:51 +02:00
medscape.com.txt Shtrom 2024 03 (#1347) 2024-03-05 11:52:33 +01:00
meduza.io.txt
megamp3.eu.txt
mein-hbf-ffm.de.txt Add files via upload (#1419) 2024-08-08 14:48:18 +02:00
mein-mmo.de.txt
meine-anzeigenzeitung.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
mentalfloss.com.txt
meowni.ca.txt
mercatornet.com.txt Create mercatornet.com.txt (#1329) 2024-02-02 18:37:29 +01:00
mercurynews.com.txt
mereorthodoxy.com.txt Add scraping configuration for mereorthodoxy.com (#1840) 2026-01-07 04:29:22 +01:00
merkur.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
merkurist.de.txt Update merkurist.de.txt 2023-10-08 10:51:03 +02:00
mesec.cz.txt
metafilter.com.txt
metro.co.uk.txt Create metro.co.uk.txt (#1129) 2023-07-03 09:29:37 +02:00
metrocop.net.txt
mforum.cari.com.my.txt
miamiherald.com.txt
microsiervos.com.txt Add microsiervos.com (#1006) 2022-11-02 09:58:44 +01:00
middleeasteye.net.txt Update middleeasteye.net.txt 2023-06-02 13:28:55 +02:00
mikeash.com.txt
mikeindustries.com.txt
milanocittastato.it.txt Create milanocittastato.it.txt (#1671) 2025-06-17 07:35:08 +02:00
minnesota.publicradio.org.txt
minnpost.com.txt
mintpressnews.com.txt
miops.com.txt Add miops.com.txt (#974) 2022-06-01 16:21:58 +02:00
mirrorfootball.co.uk.txt
mises.org.txt
missnumerique.com.txt Create missnumerique.com.txt (#1015) 2022-11-28 22:49:25 +01:00
mithatkonar.com.txt
mitie.com.txt Create mitie.com.txt 2021-09-11 18:35:52 +02:00
mittelhessen.de.txt Update mittelhessen.de.txt 2023-10-15 15:07:14 +02:00
mlb.sbnation.com.txt
mlssoccer.com.txt
mmo-champion.com.txt
mnn.com.txt
mno.hu.txt
mobile.lemondeinformatique.fr.txt
mobile.nytimes.com.txt Nyt (#1427) 2024-08-28 14:10:59 +02:00
mobile.twitter.com.txt twitter.com: fix content fetching using custom UA (#837) 2020-12-28 18:22:14 +01:00
mobilegeeks.de.txt
mobilenet.cz.txt
mobileopportunity.blogspot.com.txt
mobilmania.cz.txt
modernghana.com.txt
momentumsaga.com.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
moncarnet.com.txt Update moncarnet.com.txt (#1731) 2025-08-01 01:32:29 +02:00
monde-diplomatique.fr.txt Update monde-diplomatique.fr.txt (#1727) 2025-07-30 17:30:13 +02:00
money.cnn.com.txt
moneysavingexpert.com.txt
monkeyuser.com.txt
monkeyzen.com.txt
montelimar-news.fr.txt
moo.nac.uci.edu.txt
moonsault.de.txt
morgenpost.de.txt Funke (#1373) 2024-04-30 21:37:22 +02:00
mothering.com.txt
motherjones.com.txt
moto-net.com.txt
motorcyclistonline.com.txt
motorfull.com.txt
motorsport-magazin.com.txt Update motorsport-magazin.com.txt (#1461) 2024-10-27 17:09:51 +01:00
movie.douban.com.txt
mp.weixin.qq.com.txt Update mp.weixin.qq.com.txt (#1629) 2025-05-27 09:03:02 +02:00
msdvetmanual.com.txt fix: bad test_contains directives (#1874) 2026-02-20 18:09:13 +01:00
msn.com.txt Update msn.com.txt (#1346) 2024-03-05 10:35:34 +01:00
msnbc.msn.com.txt
mtlblog.com.txt
muenster.de.txt
multinationales.org.txt Fix multinationales.org.txt (#1773) 2025-10-10 20:46:32 +02:00
muse.jhu.edu.txt muse.jhu.edu.txt added for the journal of democracy (#1548) 2025-01-10 19:43:52 +01:00
muycomputerpro.com.txt Update muycomputerpro.com (#813) 2020-10-02 12:40:01 +02:00
muyinteresante.com.txt Fix w3349 (#1460) 2024-10-26 22:05:39 +02:00
muyinteresante.es.txt
muylinux.com.txt
mymodernmet.com.txt Add mymodernmet.com (#828) 2020-11-17 12:03:29 +01:00
myrecipes.com.txt
mysqlblog.fivefarmers.com.txt add config for mysqlblog.fivefarmers.com (#987) 2022-08-12 00:42:33 +02:00
mytotalretail.com.txt
n-tv.de.txt
n.survol.fr.txt
nachdenkseiten.de.txt
nachrichten.at.txt
naiz.eus.txt
najlepsze-ksiazki.pl.txt Create najlepsze-ksiazki.pl.txt (#1171) 2023-07-25 06:44:37 +02:00
nakedsecurity.sophos.com.txt
narratively.com.txt Update narratively.com.txt 2021-12-30 02:22:40 +01:00
nasa.gov.txt
natalie.mu.txt add 3 files (#1829) 2025-12-23 11:30:11 +01:00
nationalgeographic.de.txt Update nationalgeographic.de.txt 2021-10-29 00:22:12 +02:00
nationalpost.com.txt Update nationalpost.com.txt 2025-06-25 18:00:47 -04:00
nationalreview.com.txt Update nationalreview.com.txt (#1106) 2023-06-19 14:17:50 +02:00
natura-sciences.com.txt
nature.com.txt Update nature.com.txt (#1567) 2025-03-11 15:30:49 +01:00
nbnnews.com.au.txt
ncbi.nlm.nih.gov.txt Update ncbi.nlm.nih.gov.txt (#1295) 2024-01-07 06:11:06 +01:00
nejm.org.txt Update nejm.org.txt (#1362) 2024-04-09 11:38:19 +02:00
nerdy.dev.txt Rename custom/nerdy.dev.txt to nerdy.dev.txt (#1684) 2025-06-25 23:20:47 +02:00
net-security.org.txt
netflixtechblog.com.txt Create netflixtechblog.com.txt (#1147) 2023-07-12 06:27:21 +02:00
netmagazine.com.txt
networkworld.com.txt Idg (#1786) 2025-11-12 16:15:45 +01:00
netzoekonom.de.txt
netzpolitik.org.txt Update netzpolitik.org.txt (#1825) 2025-12-19 14:08:55 +01:00
neues-deutschland.de.txt
neunetz.com.txt
newcriterion.com.txt Create newcriterion.com.txt 2020-12-01 14:07:28 +01:00
newmedia.calcalist.co.il.txt Create newmedia.calcalist.co.il.txt (#1083) 2023-06-02 13:22:32 +02:00
newrepublic.com.txt Update newrepublic.com.txt 2022-04-09 11:25:28 +02:00
news.bayern.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
news.cnet.com.txt
news.com.au.txt Update news.com.au.txt (#1510) 2024-12-01 09:10:11 +01:00
news.detik.com.txt
news.google.com.txt Update news.google.com.txt 2023-02-16 15:58:36 +01:00
news.infoseek.co.jp.txt add limo.media.txt and news.infoseek.co.jp.txt (#1863) 2026-02-02 20:03:59 +01:00
news.jp.txt add news.jp.txt and web.gekisaka.jp.txt (#1820) 2025-12-16 14:43:32 +01:00
news.mynavi.jp.txt
news.pixelistes.com.txt
news.rambler.ru.txt
news.rub.de.txt
news.techmeme.com.txt
news.yahoo.co.jp.txt add news.yahoo.co.jp.txt (#1817) 2025-12-14 09:53:45 +01:00
news.ycombinator.com.txt Update news.ycombinator.com.txt (#1276) 2023-12-22 07:11:42 +01:00
news247.gr.txt
newsbomb.gr.txt
newsinfo.inquirer.net.txt Update newsinfo.inquirer.net.txt 2024-06-07 12:31:54 +02:00
newsletter.pragmaticengineer.com.txt Create newsletter.pragmaticengineer.com.txt (#1174) 2023-07-25 16:48:09 +02:00
newsphere.jp.txt Update xenospectrum.com add newsphere.jp jp.reuters.com (#1847) 2026-01-11 16:34:29 +01:00
newstatesman.com.txt Update newstatesman.com.txt 2022-08-19 23:53:59 +02:00
newsunspun.org.txt
newsweek.com.txt Update newsweek.com.txt 2020-08-26 01:50:07 +02:00
newswise.com.txt
newtimesslo.com.txt Create newtimesslo.com.txt (#1592) 2025-04-17 22:39:29 +02:00
newyorkaktuell.nyc.txt Add newyorkaktuell.nyc.txt with metadata and URLs (#1907) 2026-03-03 00:40:43 +01:00
newyorker.com.txt Update newyorker.com.txt (#1249) 2023-11-17 20:47:20 +01:00
next.ink.txt next.ink: fix author (#1814) 2025-12-12 08:50:07 +01:00
nextcloud.com.txt
nextdraft.com.txt add nextdraft.com (#1101) 2023-06-19 09:07:15 +02:00
nextg.tv.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
nf-farn.de.txt Create nf-farn.de.txt 2022-01-16 12:58:32 +01:00
nfl.com.txt Create nfl.com.txt 2020-08-24 02:29:29 +02:00
nicj.net.txt Added nicj.net.txt (#826) 2020-10-27 09:30:45 +01:00
nifi.apache.org.txt Create nifi.apache.org.txt (#1482) 2024-11-05 09:01:36 +01:00
nikkei.com.txt add nikkei.com.txt (#1790) 2025-11-24 12:58:39 +01:00
nintendoworldreport.com.txt
nitter.net.txt Create nitter.net.txt (#1313) 2024-01-21 10:12:06 +01:00
nj.com.txt
noidea.dog.txt Create noidea.dog.txt (#1142) 2023-07-12 06:29:20 +02:00
nojesguiden.se.txt
nordmainische-s-bahn.de.txt Add files via upload (#1419) 2024-08-08 14:48:18 +02:00
northumberlandview.ca.txt
nos.nl.txt Create nos.nl (#1116) 2023-06-26 06:32:47 +02:00
nosalty.hu.txt
nota-bene.org.txt
notebookcheck.net.txt Update notebookcheck.net.txt (#1735) 2025-08-06 15:43:35 +02:00
notimx.mx.txt Create notimx.mx.txt 2024-02-12 16:29:32 -06:00
nouvelobs.com.txt Update nouvelobs.com.txt 2024-06-14 11:28:14 +02:00
novastan.org.txt
novinky.cz.txt
np-coburg.de.txt 9 new MHS-Digital sites (#1088) 2023-06-09 06:18:07 +02:00
nplusonemag.com.txt Update nplusonemag.com.txt 2022-07-14 22:20:35 -04:00
npr.org.txt Update npr.org.txt 2020-08-24 18:21:38 +02:00
nrc.nl.txt
nrz.de.txt Funke (#1373) 2024-04-30 21:37:22 +02:00
ntoskrnl.org.txt
number.bunshun.jp.txt add number.bunshun.jp.txt and taxlabor.com.txt (#1833) 2026-01-03 22:18:54 +01:00
numerama.com.txt Update numerama.com.txt 2025-09-11 11:38:25 +02:00
nybooks.com.txt Update nybooks.com.txt 2020-10-24 12:35:05 +02:00
nymag.com.txt fix: nymag.com.txt (#1810) 2025-12-08 17:05:58 +01:00
nyra.nyc.txt Add configuration for nyra.nyc article scraping (#1859) 2026-01-28 18:44:13 +01:00
nytimes.com.txt Update nytimes.com.txt 2025-07-24 15:22:39 +02:00
nzz.ch.txt Remove "Optimize your browser" text (#1558) 2025-02-26 06:53:44 +01:00
o6asan.com.txt
oberhessische-zeitung.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
observers.france24.com.txt
ocu.org.txt Update ocu.org (#815) 2020-10-02 15:25:25 +02:00
off.net.mk.txt
oko.press.txt Update oko.press.txt (#1585) 2025-04-11 08:11:20 +02:00
oktoberfest.bayern.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
oltnertagblatt.ch.txt Create oltnertagblatt.ch.txt (#1201) 2023-09-05 15:01:08 +02:00
omgubuntu.co.uk.txt
omiliya.org.txt
onb.ac.at.txt Update onb.ac.at.txt (#1708) 2025-07-12 15:13:20 +02:00
oncletom.io.txt
onlinewelten.com.txt
ontologicalgeek.com.txt
op-online.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
open.online.txt Create open.online.txt (#1122) 2023-07-03 09:27:41 +02:00
openai.com.txt Update openai.com.txt (#1431) 2024-09-15 15:57:16 +02:00
opendemocracy.net.txt Create opendemocracy.net.txt 2021-11-01 22:53:35 +01:00
opensource.com.txt add config for opensource.com (#1093) 2023-06-16 21:49:51 +02:00
opensource.org.txt
openstreetmap.org.txt add openstreetmap.org blog rules (#795) 2020-08-28 06:07:49 +02:00
openthemagazine.com.txt
optimizesmart.com.txt Create optimizesmart.com.txt 2022-08-20 10:30:54 +02:00
orf.at.txt
orientxxi.info.txt
origo.hu.txt
oschina.net.txt
osmand.net.txt
osmc.tv.txt
ostechnix.com.txt Create ostechnix.com.txt 2021-02-01 14:27:15 +01:00
ostprog.de.txt Update ostprog.de.txt 2022-10-01 23:53:28 +02:00
otz.de.txt Funke (#1373) 2024-04-30 21:37:22 +02:00
ourworldindata.org.txt
outsideonline.com.txt Update outsideonline.com.txt 2021-10-29 00:13:19 +02:00
ovb-online.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
overreacted.io.txt Update overreacted.io.txt 2025-10-11 14:06:05 +02:00
oxfordamerican.org.txt
paddle.com.txt Create paddle.com.txt 2024-05-16 16:11:14 +02:00
pagenotfound.cz.txt Update pagenotfound.cz.txt (#1601) 2025-04-29 13:05:38 +02:00
palmbeachpost.com.txt
pandemicequityinitiative.com.txt Shtrom 2024 01 (#1293) 2024-01-07 06:05:15 +01:00
pandodaily.com.txt
panic.com.txt
paperpaper.ru.txt Update paperpaper.ru.txt 2020-08-31 20:49:49 +02:00
papertohtml.org.txt Update papertohtml.org.txt 2024-12-26 13:19:48 +01:00
papodehomem.com.br.txt
paquier.xyz.txt
parislemon.com.txt
parliament.uk.txt
parlimen.gov.my.txt Create parlimen.gov.my.txt (#1687) 2025-06-27 16:52:36 +02:00
parool.nl.txt
pastebin.com.txt
pastepad.fivefilters.org.txt
pathawks.com.txt
patreon.com.txt Update patreon.com.txt (#1488) 2024-11-10 16:30:37 +01:00
pcgamer.com.txt Add cohost.org ko-fi.com and pcgamer.com (#1364) 2024-04-13 16:18:23 +02:00
pcmag.com.txt Update pcmag.com.txt 2023-03-29 23:24:28 +02:00
pcworld.com.txt Update pcworld.com.txt (#1681) 2025-06-24 17:20:41 +02:00
penny-arcade.com.txt
pentaxforums.com.txt
peoplesdispatch.org.txt Create peoplesdispatch.org.txt (#1418) 2024-08-08 14:29:29 +02:00
perell.com.txt
perspective-daily.de.txt fix perspective-daily title 2020-09-17 16:27:38 +02:00
pestemag.com.txt Create pestemag.com.txt 2022-10-16 14:51:20 +02:00
pfefferminzia.de.txt Create pfefferminzia.de.txt (#1068) 2023-03-19 20:47:56 +01:00
pflegen-online.de.txt Create pflegen-online.de.txt (#1268) 2023-12-13 13:58:52 +01:00
pharmazeutische-zeitung.de.txt Update pharmazeutische-zeitung.de.txt (#1150) 2023-07-12 06:26:29 +02:00
phastidio.net.txt
philosophyforlife.org.txt Create philosophyforlife.org.txt (#1092) 2023-06-14 16:27:55 +02:00
philosophynow.org.txt Create philosophynow.org.txt 2020-12-01 14:10:23 +01:00
philstar.com.txt
phoronix.com.txt Update phoronix.com (#1267) 2023-12-10 21:35:04 -08:00
photo.tutsplus.com.txt
photografix-magazin.de.txt Update photografix-magazin.de.txt (#1266) 2023-12-10 11:41:34 +01:00
photopills.com.txt Create photopills.com.txt (#1176) 2023-08-12 10:21:15 +02:00
phototrend.fr.txt
php.net.txt
phys.org.txt Update phys.org.txt (#1572) 2025-03-19 19:58:00 +01:00
pinterest.com.txt
piped.video.txt Create piped.video.txt (#1203) 2023-09-18 09:20:56 +02:00
pitchfork.com.txt Update pitchfork.com.txt 2022-12-05 21:28:02 +01:00
pittsburghmagazine.com.txt
pixellibre.net.txt
pjmedia.com.txt
placegrenet.fr.txt
planet3dnow.de.txt
planetvita.de.txt
playboy.com.txt
playgroupnsw.org.au.txt
ploum.net.txt fix: ploum.net (#1542) 2025-01-03 18:16:00 +01:00
pluralistic.net.txt Update pluralistic.net.txt (#1489) 2024-11-10 21:14:08 +01:00
plus.google.com.txt
plzkthxbai.com.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
pmf.silvrback.com.txt
poetryfoundation.org.txt Update poetryfoundation.org.txt 2022-04-18 10:59:31 +02:00
poets.org.txt Update scraping rules for poets.org (#1854) 2026-01-24 14:02:16 +01:00
pogue.blogs.nytimes.com.txt
politico.com.txt Update politico.com.txt (#1636) 2025-05-31 04:48:03 +02:00
politifact.com.txt
politiken.dk.txt
politis.fr.txt LPL (#1788) 2025-11-18 21:24:33 +01:00
polka.academy.txt Update polka.academy.txt (#1078) 2023-04-13 19:31:29 +02:00
polygon.com.txt Update polygon.com.txt (#1136) 2023-07-06 10:16:07 +02:00
popehat.com.txt
popsci.com.txt
popularmechanics.com.txt Popularmechanics (#1649) 2025-06-08 10:26:54 +02:00
portertech.ca.txt
positioningmag.com.txt
posta.com.tr.txt
posteo.de.txt
postnauka.ru.txt Update postnauka.ru.txt 2021-03-18 09:44:37 +01:00
preparedfoods.com.txt
president.jp.txt Add president.jp (#1794) 2025-11-29 11:34:45 +01:00
presse-citron.net.txt
presseportal.de.txt modified: presseportal.de (#962) 2022-03-27 14:12:27 +02:00
primaonline.it.txt Create primaonline.it.txt (#1651) 2025-06-08 12:49:46 +02:00
privacyinternational.org.txt
pro-linux.de.txt
prog21.dadgum.com.txt
prolost.com.txt
propakistani.pk.txt
propublica.org.txt
proskauer.com.txt
prospectmagazine.co.uk.txt
protocol.com.txt Add protocol.com.txt (#864) 2021-03-15 01:41:26 +01:00
protonmail.com.txt
protothema.gr.txt
psu.edu.txt Create psu.edu.txt (#919) 2022-01-03 10:02:13 +01:00
psyche.co.txt Update psyche.co.txt (#1639) 2025-06-01 01:37:32 +02:00
psychologytoday.com.txt
psypost.org.txt Create psypost.org.txt (#1631) 2025-05-28 10:01:26 +02:00
publications.aap.org.txt Shtrom 2024 03 (#1347) 2024-03-05 11:52:33 +01:00
publications.parliament.uk.txt
publicdomainreview.org.txt Update publicdomainreview.org.txt (#1683) 2025-06-25 14:34:47 +02:00
publico.pt.txt
puri.sm.txt Update puri.sm.txt (#1060) 2023-03-09 22:46:20 +01:00
putaindecode.io.txt
putsch.media.txt
pxlnv.com.txt
pymotw.com.txt
python.org.txt Create python.org.txt 2022-03-01 21:38:18 +01:00
qctimes.com.txt
qntm.org.txt Create qntm.org.txt (#1504) 2024-11-22 08:20:04 +01:00
quantamagazine.org.txt Update quantamagazine.org.txt (#1660) 2025-06-13 17:53:15 +02:00
quantumdiaries.org.txt
quechoisir.org.txt
queerty.com.txt
questionablecontent.net.txt
queue.acm.org.txt
quickanddirtytips.com.txt
quora.com.txt Update quora.com.txt 2020-11-15 20:44:02 +01:00
qz.com.txt Update qz.com.txt 2020-11-09 15:00:40 +01:00
rachelandrew.co.uk.txt
racjonalista.pl.txt
radar.oreilly.com.txt
radionz.co.nz.txt
radishzz.cc.txt Create radishzz.cc.txt (#1530) 2024-12-14 20:54:30 +01:00
rancher.com.txt
randsinrepose.com.txt
rasgolatente.es.txt
rbb24.de.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
reactjs.org.txt Update reactjs.org.txt 2022-08-20 10:37:03 +02:00
reactormag.com.txt Create reactormag.com.txt (#1760) 2025-09-11 10:59:02 +02:00
readingthechinadream.com.txt Create readingthechinadream.com.txt (#1187) 2023-08-20 12:54:45 +02:00
README.md Update README.md 2025-07-24 15:30:30 +02:00
real.gr.txt
rebelionenlagranja.com.txt Create rebelionenlagranja.com.txt (#933) 2022-02-17 20:22:47 +01:00
rebooti.com.txt
recode.net.txt
redalemeden.com.txt
redbull.com.txt
reddit.com.txt Update reddit.com.txt (#1691) 2025-06-30 16:19:20 +02:00
redeszone.net.txt
redmas.com.co.txt Create redmas.com.co.txt 2024-02-12 16:34:18 -06:00
redmondpie.com.txt
redtimmy.com.txt Config for blogposts at redtimmy.com (#796) 2020-09-02 10:44:13 +02:00
refinery29.com.txt Update refinery29.com.txt 2022-07-04 16:48:47 -04:00
reflets.info.txt LPL (#1788) 2025-11-18 21:24:33 +01:00
regionaltangente-west.de.txt Update regionaltangente-west.de.txt (#1564) 2025-03-06 21:41:00 +01:00
reitschuster.de.txt Create reitschuster.de.txt (#1319) 2024-01-26 19:49:12 +01:00
renenekuda.cz.txt
renverse.co.txt Create renverse.co.txt 2022-04-09 10:41:04 +02:00
report-k.de.txt Create report-k.de (#1190) 2023-08-27 10:53:55 +02:00
reportermagazin.cz.txt
reporterre.net.txt Update reporterre.net.txt 2023-10-08 12:02:48 +02:00
researchandmarkets.com.txt Create researchandmarkets.com.txt 2021-04-13 21:00:20 +02:00
researchgate.net.txt Update researchgate.net.txt (#1539) 2024-12-28 14:04:15 +01:00
resilience.org.txt Create resilience.org.txt (#918) 2021-12-13 21:16:32 +01:00
retractionwatch.com.txt
retro-games.fr.txt add retro-games.fr (#1102) 2023-06-19 09:07:34 +02:00
reuters.com.txt Revert reuters.com header changes 2025-10-03 18:10:09 +02:00
revdennismccarty.com.txt Create revdennismccarty.com.txt 2021-07-27 12:18:29 +02:00
reves-d-espace.com.txt adding reves-d-espace.com (#1876) 2026-02-21 12:22:04 +01:00
revue-farouest.fr.txt
rewe.de.txt Create rewe.de.txt (#1706) 2025-07-07 10:05:00 +02:00
rework.withgoogle.com.txt
rezeptwelt.de.txt
rfi.fr.txt Create rfi.fr.txt (#1602) 2025-04-29 13:30:54 +02:00
rhein-kreis-neuss.de.txt Update rhein-kreis-neuss.de.txt 2020-12-30 19:23:05 +01:00
richardkmorgan.com.txt Create richardkmorgan.com.txt (#1161) 2023-07-20 14:04:57 +02:00
riedbahn.de.txt Add files via upload (#1419) 2024-08-08 14:48:18 +02:00
riffreporter.de.txt Create riffreporter.de.txt 2022-02-17 22:08:47 +01:00
ritimo.org.txt Shtrom 2024 03 (#1347) 2024-03-05 11:52:33 +01:00
rnd.de.txt Add rnd.de.txt (#882) 2021-05-14 00:47:15 +02:00
roadandtrack.com.txt Create roadandtrack.com.txt (#1646) 2025-06-08 08:31:54 +02:00
robertsspaceindustries.com.txt
robots.thoughtbot.com.txt
rockpapershotgun.com.txt Improvements to eurogamer.net, heise.de, rockpapershotgun.com, tagesschau.de and zeit.de. Fix golem.de (#936) 2022-02-28 06:39:51 +01:00
rockylinux.org.txt Update rockylinux.org.txt (#1133) 2023-07-05 14:26:51 +02:00
rodrigo.sharpcube.com.txt
rogerebert.com.txt Update rogerebert.com.txt 2020-11-06 19:06:23 +01:00
rollingstone.com.txt
rom-game.fr.txt
romchip.org.txt Create romchip.org.txt (#1540) 2024-12-29 13:33:39 +01:00
roomescapeartist.com.txt
root.cz.txt
rosenheim24.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
rottentomatoes.com.txt
roughtype.com.txt
roy.gbiv.com.txt
royalsocietypublishing.org.txt Add royalsocietypublishing.org.txt (#878) 2021-05-10 13:40:01 +02:00
rpgsite.net.txt
rtbf.be.txt Add author and update body extraction rules (#1862) 2026-01-31 17:48:15 +01:00
rtings.com.txt Rtings.com (#1113) 2023-06-26 06:34:25 +02:00
rubysfera.pl.txt
rue89bordeaux.com.txt LPL (#1788) 2025-11-18 21:24:33 +01:00
rue89lyon.fr.txt LPL (#1788) 2025-11-18 21:24:33 +01:00
rue89strasbourg.com.txt LPL (#1788) 2025-11-18 21:24:33 +01:00
rugbyrama.fr.txt Create rugbyrama.fr.txt 2024-06-28 13:36:37 +02:00
ruhlman.com.txt
ruhr24.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
rums.ms.txt Rename rums.ms to rums.ms.txt 2020-11-23 11:50:52 +01:00
rust-lang-nursery.github.io.txt
s6-frankfurt-friedberg.de.txt Add files via upload (#1419) 2024-08-08 14:48:18 +02:00
saadaalnews.net.txt
sacbee.com.txt
salon.com.txt fix: salon.com body rule (#1869) 2026-02-18 16:07:20 +01:00
saltyworld.net.txt
salzburg.com.txt
san.com.txt Create san.com.txt (#1369) 2024-04-27 12:07:41 +02:00
sankei.com.txt add sankei.com (#1851) 2026-01-21 14:25:02 +01:00
sanpedrosun.com.txt
sapiens.org.txt Create sapiens.org.txt (#1517) 2024-12-05 14:07:03 +01:00
sargasso.nl.txt
sauerlandkurier.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
saveyourself.ca.txt
sayidaty.net.txt
sbnation.com.txt
scheuch.de.txt Create scheuch.de.txt (#1455) 2024-10-23 04:22:48 +02:00
schneier.com.txt
schwarzwaelder-bote.de.txt 9 new MHS-Digital sites (#1088) 2023-06-09 06:18:07 +02:00
science.org.txt Create science.org.txt 2021-10-05 13:47:43 +02:00
scienceblogs.de.txt
sciencedirect.com.txt Create sciencedirect.com.txt (#1308) 2024-01-15 16:41:00 +01:00
sciencepresse.qc.ca.txt Update sciencepresse.qc.ca.txt (#1724) 2025-07-26 08:53:25 +02:00
scienceticker.info.txt
scientificamerican.com.txt Update scientificamerican.com.txt 2020-10-17 13:29:47 +02:00
scilogs.de.txt
scinfolex.com.txt
scnsrc.me.txt
scotthelme.co.uk.txt Add scotthelme.co.uk.txt (#944) 2022-03-03 06:47:39 +01:00
scottohara.me.txt
scotusblog.com.txt
scripting.com.txt
scroll.in.txt Create scroll.in.txt 2021-01-15 14:40:13 +01:00
sdxcentral.com.txt
searchenginejournal.com.txt
searchengineland.com.txt
seattletimes.com.txt Fix getting full text, similar to nytimes.com (#1168) 2023-07-25 06:46:38 +02:00
seattletransitblog.com.txt
sebsauvage.net.txt Update sebsauvage.net.txt (#1143) 2023-07-12 06:26:57 +02:00
sec.gov.txt Create sec.gov.txt 2025-08-19 13:09:58 +02:00
secouchermoinsbete.fr.txt
secretmag.ru.txt
securelist.com.txt
securityaffairs.co.txt
securitylab.ru.txt Create securitylab.ru.txt (#1468) 2024-10-30 01:45:34 +01:00
secushare.org.txt
segment.com.txt
select.yeeyan.org.txt
semiaccurate.com.txt
sempredirebanzai.it.txt Create sempredirebanzai.it.txt (#1643) 2025-06-06 09:50:26 +02:00
senscritique.com.txt Create senscritique.com.txt (#1677) 2025-06-20 13:18:23 +02:00
seriouseats.com.txt Refactor extraction rules for seriouseats.com (#1778) 2025-10-19 19:37:03 +02:00
serpentinegalleries.org.txt Create serpentinegalleries.org.txt (#1433) 2024-09-21 02:58:46 +02:00
servethehome.com.txt
seznamzpravy.cz.txt Seznamzpravy.cz 2 (#1123) 2023-07-01 09:19:38 +02:00
sf.eater.com.txt
sfchronicle.com.txt Create sfchronicle.com.txt (#1653) 2025-06-08 14:27:29 +02:00
sfgate.com.txt
sfweekly.com.txt
shabayek.com.txt
shahinkalantari.com.txt Create shahinkalantari.com.txt (#1162) 2023-07-20 22:22:56 +02:00
share.ez.no.txt
shawnblanc.net.txt
shepherd.com.txt Create shepherd.com.txt (#1475) 2024-11-02 00:08:02 +01:00
shifteleven.com.txt
shipilev.net.txt
shs.cairn.info.txt Update and rename cairn.info.txt to shs.cairn.info.txt (#1754) 2025-08-30 09:55:27 +02:00
shueisha.online.txt add dailyshincho.jp.txt and shueisha.online.txt (#1824) 2025-12-19 13:36:11 +01:00
shz.de.txt
siecledigital.fr.txt Update stripping rules and test URL in siecledigital.fr.txt (#1852) 2026-01-22 16:37:38 +01:00
signal.org.txt Updated signal.org.txt (#833) 2020-12-15 20:39:18 +01:00
singaporeanstocksinvestor.blogspot.com.txt
singularityhub.com.txt
sivers.org.txt
slashdot.org.txt slashdot: replace i tags with blockquote (#929) 2022-02-15 07:16:41 +01:00
slashfilm.com.txt
slate.com.txt slate.com: improve ad stripping (#839) 2021-01-04 07:02:50 +01:00
slate.fr.txt Update slate.fr.txt 2024-05-28 10:36:08 +02:00
slice.seriouseats.com.txt
slog.thestranger.com.txt
slrlounge.com.txt Added slrlounge.com.txt (#963) 2022-03-31 20:53:40 +02:00
smarthomebeginner.com.txt
smashingmagazine.com.txt Update smashingmagazine.com.txt (#1084) 2023-05-16 06:38:14 +02:00
smbc-comics.com.txt
sme.sk.txt
smh.com.au.txt Update smh.com.au.txt (#1445) 2024-10-16 14:51:16 +02:00
smithsonianmag.com.txt fix: bad format errors (#1811) 2025-12-09 13:48:06 +01:00
snip.ly.txt
snob.ru.txt Update snob.ru.txt 2021-12-29 16:26:21 +01:00
soester-anzeiger.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
somethingawful.com.txt
songshuhui.net.txt
soundcity.tv.txt
soundonsound.com.txt Update soundonsound.com.txt 2021-07-23 19:26:08 +02:00
sourcebooks.com.txt
sowetanlive.co.za.txt
space.com.txt Update space.com.txt (#1744) 2025-08-12 07:07:28 +02:00
spacetoday.com.br.txt chore: add configuration file for spacetoday.com.br (#1746) 2025-08-14 03:44:57 +02:00
spacex.com.txt Add scraping instructions for spacex.com (#1781) 2025-11-01 07:28:52 +01:00
spectator.co.uk.txt Update spectator.co.uk.txt (#1666) 2025-06-14 07:37:10 +02:00
spectrejournal.com.txt Create spectrejournal.com.txt 2021-03-12 10:12:43 +01:00
spectrum.ieee.org.txt Update spectrum.ieee.org.txt (#1204) 2023-09-18 08:59:30 +02:00
spektrum.de.txt Update spektrum.de.txt (#1107) 2023-06-19 14:33:49 +02:00
spiderum.com.txt Update spiderum.com.txt 2022-08-27 12:26:19 +02:00
spiegel.de.txt Update spiegel.de.txt 2025-06-25 17:39:40 -04:00
spiked-online.com.txt
spin.com.txt
splinternews.com.txt
sport.detik.com.txt
sport365.fr.txt
sportiva.shueisha.co.jp.txt add sportiva.shueisha.co.jp (#1804) 2025-12-06 03:08:39 +01:00
sports.ru.txt Create sports.ru.txt (#1635) 2025-05-31 01:00:45 +02:00
sprengsatz.de.txt
sputniknews.com.txt
sqlite.org.txt
squashed.tumblr.com.txt
srf.ch.txt
stackoverflow.blog.txt
stackoverflow.com.txt Fixed stackoverflow.com.txt (#1090) 2023-06-13 10:19:48 +02:00
stadt-bremerhaven.de.txt Update stadt-bremerhaven.de.txt (#1063) 2023-03-16 15:50:19 +01:00
stadt-muenster.de.txt
stadtpost.de.txt Create stadtpost.de.txt (#1109) 2023-06-19 18:22:04 +02:00
staltz.com.txt
standard.co.uk.txt Update standard.co.uk.txt (#1591) 2025-04-17 22:37:53 +02:00
standardebooks.org.txt Add site config for standardebooks.org; update cbsnews.com (#1834) 2026-01-04 08:16:26 +01:00
standblog.org.txt Update XPath queries and add string replacements (#1879) 2026-02-23 14:16:43 +01:00
star-telegram.com.txt
statista.com.txt Create statista.com / es.statista.com (#852) 2021-01-20 13:51:38 +01:00
steamcommunity.com.txt Update steamcommunity.com.txt (#1562) 2025-02-28 05:35:40 +01:00
stefanjudis.com.txt
stephenfry.com.txt
stiftung-gegm.de.txt Create stiftung-gegm.de.txt (#1702) 2025-07-04 10:44:50 +02:00
stjv.fr.txt
stockholmsfria.se.txt
stopgame.ru.txt Create stopgame.ru.txt 2021-02-10 14:22:35 +01:00
straightdope.com.txt
straitstimes.com.txt Create straitstimes.com.txt (#1312) 2024-01-21 09:29:34 +01:00
stratfor.com.txt
stratobuilds.com.txt Add configuration for scraping stratobuilds.com (#1905) 2026-03-02 22:53:05 +01:00
streetsblog.net.txt
stuff.co.nz.txt
stumbleupon.com.txt
stuttgarter-nachrichten.de.txt 9 new MHS-Digital sites (#1088) 2023-06-09 06:18:07 +02:00
stuttgarter-zeitung.de.txt 9 new MHS-Digital sites (#1088) 2023-06-09 06:18:07 +02:00
substack.com.txt Subst2 (#1713) 2025-07-15 10:07:06 +02:00
subtraction.com.txt
sueddeutsche.de.txt
sukusuku.tokyo-np.co.jp.txt add sukusuku.tokyo-np.co.jp.txt (#1864) 2026-02-03 14:53:21 +01:00
sulek.fr.txt Create sulek.fr.txt (#1456) 2024-10-24 15:15:10 +02:00
summitroute.com.txt
sun-connect.org.txt Create sun-connect.org.txt (#1043) 2023-02-06 07:05:02 +01:00
sunshinecoastdaily.com.au.txt
supchina.com.txt Create .supchina.com.txt (#896) 2021-08-17 16:53:32 +02:00
superuser.openinfra.dev.txt Create superuser.openinfra.dev (#1457) 2024-10-24 22:39:37 +02:00
svd.se.txt
svt.se.txt Update svt.se.txt 2024-05-23 14:43:05 +02:00
swcarpentry.github.io.txt Create swcarpentry.github.io.txt 2022-06-13 00:41:57 +02:00
swissinfo.ch.txt Update swissinfo.ch.txt 2024-12-06 20:06:10 +01:00
switchonpaper.com.txt
sydsvenskan.se.txt
symmetrymagazine.org.txt
symphozik.info.txt Fix w3349 (#1460) 2024-10-26 22:05:39 +02:00
synbioz.com.txt
syncfusion.com.txt Create syncfusion.com.txt (#1669) 2025-06-14 16:38:13 +02:00
sz-magazin.sueddeutsche.de.txt
t-online.de.txt Create t-online.de.txt (#1409) 2024-07-29 01:20:37 +02:00
t3n.de.txt t3n.de: Fix cookie consent (#998) 2022-10-13 10:08:04 +02:00
t3terminal.com.txt Update t3terminal.com.txt 2020-11-10 12:45:56 +01:00
tabletmag.com.txt Update tabletmag.com.txt 2024-12-27 12:19:04 +01:00
tagblatt.de.txt
tagesanzeiger.ch.txt tagesanzeiger.ch.txt completely rewritten and replaced (#1021) 2022-12-27 13:31:15 +01:00
tagesschau.de.txt fix: bad test_contains directives (#1874) 2026-02-20 18:09:13 +01:00
tagesspiegel.de.txt Update tagesspiegel.de.txt (#1180) 2023-08-12 10:18:21 +02:00
tailscale.com.txt Adam (#1667) 2025-06-14 08:59:23 +02:00
takt-magazin.de.txt
taste.com.au.txt
tasteofhome.com.txt
taxacc.jp.txt add gendai.media.txt xenospectrum.com.txt taxacc.jp.txt (#1821) 2025-12-17 05:44:08 +01:00
taxlabor.com.txt add number.bunshun.jp.txt and taxlabor.com.txt (#1833) 2026-01-03 22:18:54 +01:00
taz.de.txt Modify extraction rules and add test URLs (#1812) 2025-12-11 14:20:48 +01:00
tbray.org.txt
teamliquid.net.txt
tech.sina.com.cn.txt
techcommunity.microsoft.com.txt
techcrunch.com.txt
techdirt.com.txt
techhive.com.txt
techmeme.com.txt
techno-science.net.txt
technologizer.com.txt
technologyreview.com.txt Create technologyreview.com.txt 2020-09-11 22:35:31 +02:00
techpinions.com.txt
techradar.com.txt Update techradar.com.txt 2023-04-02 00:39:57 +02:00
techstage.de.txt Update techstage.de.txt (#1396) 2024-06-18 03:50:17 +02:00
ted.com.txt
telegraph.co.uk.txt Update telegraph.co.uk.txt (#1547) 2025-01-10 14:32:47 +01:00
telepolis.de.txt Remove unnecessary comments and update XPath selectors (#1882) 2026-02-23 15:18:23 +01:00
telerama.fr.txt Add login settings for telerama.fr (#1010) 2022-11-08 17:06:05 +01:00
tennis.com.txt Update tennis.com.txt 2023-03-02 10:56:58 +01:00
terrestres.org.txt
texasmonthly.com.txt Update texasmonthly.com.txt 2021-08-23 14:58:58 +02:00
the-magazine.org.txt
the-scientist.com.txt
the-tls.co.uk.txt Create the-tls.co.uk.txt 2020-08-24 01:38:24 +02:00
theage.com.au.txt
theamericanscholar.org.txt
theathletic.com.txt
theatlantic.com.txt fix: bad format errors (#1811) 2025-12-09 13:48:06 +01:00
theatlanticcities.com.txt
thebaffler.com.txt Create thebaffler.com.txt 2023-01-17 22:13:52 +01:00
theblueprint.ru.txt Update theblueprint.ru.txt 2022-05-26 13:38:54 +02:00
thebulletin.org.txt Update thebulletin.org.txt 2023-07-22 13:16:24 +02:00
thecitypaperbogota.com.txt Create thecitypaperbogota.com.txt (#1498) 2024-11-17 07:06:15 +01:00
thecode.media.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
thecounter.org.txt Update thecounter.org.txt 2022-01-25 21:37:15 +01:00
thecreativeindependent.com.txt Create thecreativeindependent.com.txt 2021-01-06 16:20:03 +01:00
thecut.com.txt Update thecut.com.txt (#1047) 2023-02-13 07:56:15 +01:00
thedailybeast.com.txt
thedailymash.co.uk.txt
thedisneyblog.com.txt
thedrive.com.txt Update thedrive.com.txt 2021-03-13 16:36:11 +01:00
thefader.com.txt Update thefader.com.txt 2022-01-25 21:34:14 +01:00
thefilmexperience.net.txt
theflaw.org.txt Create theflaw.org.txt (#1581) 2025-04-01 02:03:48 +02:00
thegamedesignforum.com.txt
thegap.at.txt
theglobalmail.org.txt
thegreatdiscontent.com.txt
theguardian.com.txt Fix for theguardian (#1711) 2025-07-14 20:10:37 +02:00
thehansindia.com.txt Update thehansindia.com.txt 2021-10-29 01:04:44 +02:00
thehindu.com.txt Update thehindu.com.txt 2022-06-06 15:52:36 +02:00
theins.ru.txt Update theins.ru.txt 2022-04-09 11:06:35 +02:00
theintercept.com.txt
theinventory.com.txt The kinja sites updated their engine and now they tag their body content using "js_post-content" instead of just "post-content" (#917) 2021-11-29 19:45:01 +01:00
thekitchn.com.txt Create thekitchn.com.txt (#1112) 2023-06-26 06:34:41 +02:00
them.us.txt Adds config for them.us (#1087) 2023-06-05 09:18:16 +02:00
themarker.com.txt Update themarker.com.txt (#1082) 2023-05-12 19:37:35 +02:00
themillions.com.txt
thenation.com.txt
thenetworkgarden.blogs.com.txt
thenewatlantis.com.txt add site config for thenewatlantis (#830) 2020-11-27 09:24:04 +01:00
thenewdaily.com.au.txt Thenewdaily.com.au (#1321) 2024-01-27 05:54:13 +01:00
thenews.coop.txt
thenewstribune.com.txt
thenextgeneration.org.txt
thenextweb.com.txt
theoaklandpress.com.txt
theodinproject.com.txt Create theodinproject.com.txt 2022-06-13 00:44:40 +02:00
theonion.com.txt The kinja sites updated their engine and now they tag their body content using "js_post-content" instead of just "post-content" (#917) 2021-11-29 19:45:01 +01:00
theoutline.com.txt
theplayerstribune.com.txt Create theplayerstribune.com.txt (#1172) 2023-07-25 06:43:56 +02:00
thepointmag.com.txt
theregister.co.uk.txt theregister: improve support of quotes (#886) 2021-05-22 22:13:16 +02:00
theregister.com.txt theregister: improve support of quotes (#886) 2021-05-22 22:13:16 +02:00
theringer.com.txt Update theringer.com.txt 2020-11-04 11:35:38 +01:00
theroot.com.txt The kinja sites updated their engine and now they tag their body content using "js_post-content" instead of just "post-content" (#917) 2021-11-29 19:45:01 +01:00
therumpus.net.txt
thesaturdaypaper.com.au.txt
theses.enc.sorbonne.fr.txt
thesimpledollar.com.txt
theskepticalcardiologist.com.txt Create theskepticalcardiologist.com.txt (#1638) 2025-05-31 15:10:27 +02:00
thesocialitefamily.com.txt
thespoof.com.txt
thestranger.com.txt
thesun.co.uk.txt Create thesun.co.uk.txt 2020-10-13 13:32:21 +02:00
thetakeout.com.txt The kinja sites updated their engine and now they tag their body content using "js_post-content" instead of just "post-content" (#917) 2021-11-29 19:45:01 +01:00
theteaspot.com.txt Create theteaspot.com.txt 2021-02-14 00:10:46 +01:00
thethaovanhoa.vn.txt
thetimes.co.uk.txt Thetimes (#1194) 2023-08-29 16:20:41 +02:00
thetorah.com.txt Update thetorah.com.txt 2020-08-26 02:37:28 +02:00
theverge.com.txt Update theverge.com.txt (#1741) 2025-08-10 04:51:53 +02:00
theweek.com.txt
thewirecutter.com.txt
thingiverse.com.txt
thinkspot.com.txt Create thinkspot.com.txt 2020-08-31 23:57:55 +02:00
thinkwithgoogle.com.txt Update thinkwithgoogle.com.txt (#1124) 2023-07-03 09:28:08 +02:00
thisamericanlife.org.txt
thisiscolossal.com.txt
thoughtco.com.txt
threadreaderapp.com.txt Update threadreaderapp.com.txt 2021-05-04 21:40:11 +02:00
threatpost.com.txt
thrillist.com.txt
thueringer-allgemeine.de.txt Funke (#1373) 2024-04-30 21:37:22 +02:00
ticket.interpark.com.txt Create ticket.interpark.com.txt (#1077) 2023-04-13 09:33:46 +02:00
tidbits.com.txt
tijd.be.txt
time.com.txt Update time.com.txt 2022-05-06 15:24:11 +02:00
timeshighereducation.co.uk.txt
timeshighereducation.com.txt
timesofisrael.com.txt Create timesofisrael.com.txt 2025-06-24 15:30:04 -04:00
tipb.com.txt
titanic-magazin.de.txt
tldp.org.txt
tlz.de.txt Funke (#1373) 2024-04-30 21:37:22 +02:00
tnr.com.txt
tobias-hartmann.net.txt
tofugu.com.txt
tokyo-np.co.jp.txt add tokyo-np.co.jp.txt and businessinsider.jp.txt (#1823) 2025-12-18 13:42:33 +01:00
tomdispatch.com.txt
tomsguide.com.txt Update tomsguide.com.txt (#1661) 2025-06-14 06:38:02 +02:00
tomshardware.com.txt Update tomshardware.com.txt 2023-04-02 00:39:37 +02:00
tomshardware.de.txt
toolinux.com.txt
toolsandtoys.net.txt
topnews.jp.txt add finance.yahoo.co.jp.txt and topnews.jp.txt (#1856) 2026-01-26 19:21:42 +01:00
torgranate.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
torn.com.txt Torn (#1618) 2025-05-13 20:33:21 +02:00
torontolife.com.txt Update torontolife.com.txt 2022-12-20 00:23:38 +01:00
touilleur-express.fr.txt Add touilleur-express.fr.txt (#952) 2022-03-10 07:40:02 +01:00
tourmag.com.txt
touteduc.fr.txt
towardsdatascience.com.txt Create towardsdatascience.com.txt (#1148) 2023-07-12 06:25:30 +02:00
towerofthehand.com.txt
toyokeizai.net.txt Add replace(h2) and use strip id or class (#1828) 2025-12-22 09:34:12 +01:00
tracks.ranea.org.txt
tradingforaliving.pl.txt Create tradingforaliving.pl.txt (#1476) 2024-11-03 15:12:49 +01:00
trailer.web-view.net.txt
trailers.apple.com.txt
trailerzone.de.txt
traningslara.se.txt
trendmicro.com.txt Trendmicro (#1348) 2024-03-06 13:00:22 +01:00
triblive.com.txt
triple-c.at.txt Create triple-c.at.txt (#1378) 2024-05-15 16:24:53 +02:00
triplebyte.com.txt
trouw.nl.txt
troyhunt.com.txt
trustedreviews.com.txt
truthdig.com.txt
truthout.org.txt Update truthout.org.txt 2021-04-04 16:25:28 +02:00
tthfanfic.org.txt
tuaw.com.txt
tuhdo.github.io.txt
turnoff.us.txt
tvline.com.txt Update tvline.com.txt (#1091) 2023-06-14 09:03:14 +02:00
tvtropes.org.txt
tweakers.net.txt
twitter.com.txt twitter.com: fix content fetching using custom UA (#837) 2020-12-28 18:22:14 +01:00
twog.fr.txt Update twog.fr.txt 2021-04-30 15:59:42 +02:00
typo3.com.txt Update typo3.com.txt 2021-03-29 13:18:42 +02:00
typo3.org.txt Update typo3.org.txt 2021-03-29 13:21:33 +02:00
tz.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
ubuntugeek.com.txt
udn.com.txt Update udn.com.txt (#1402) 2024-07-09 16:39:19 +02:00
uefa.com.txt
ufu.de.txt Create ufu.de.txt (#1701) 2025-07-04 10:15:36 +02:00
uk.xbox360.ign.com.txt
uncannymagazine.com.txt Create uncannymagazine.com.txt (#1565) 2025-03-10 18:22:49 +01:00
unherd.com.txt Update unherd.com.txt 2023-10-28 09:49:05 +02:00
uni-watch.com.txt
universe.shelfd.com.txt Create universe.shelfd.com.txt (#1454) 2024-10-23 01:02:59 +02:00
unsertirol24.com.txt
unwinnable.com.txt
uol.com.br.txt Create uol.com.br.txt (#1583) 2025-04-06 09:55:33 +02:00
urbandictionary.com.txt
us-cert.gov.txt
usatoday.com.txt
usbeketrica.com.txt Update usbeketrica.com.txt (#1307) 2024-01-14 20:01:11 +01:00
useit.com.txt
usenix.org.txt add usenix.org (#1104) 2023-06-19 09:08:05 +02:00
utcc.utoronto.ca.txt chore: add body and date rules for utcc utoronto (#1893) 2026-02-24 15:06:47 +01:00
utdailybeacon.com.txt
utiliser-lightroom.com.txt
utux.fr.txt
ux.artu.tv.txt
uxdesign.cc.txt create 3 new configs (#1149) 2023-07-12 06:28:40 +02:00
vakarm.net.txt Added vakarm.net.txt (#809) 2020-09-20 22:43:14 +02:00
valdaiclub.com.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
vanityfair.com.txt Update vanityfair.com.txt 2022-02-24 04:15:56 +01:00
variety.com.txt
varsity.co.uk.txt
vc.ru.txt Create vc.ru.txt 2021-02-01 13:40:31 +01:00
vedonlyonti.com.txt Create vedonlyonti.com.txt 2024-05-07 23:21:33 +02:00
velomotion.de.txt
venturebeat.com.txt
verlagshaus-jaumann.de.txt 9 new MHS-Digital sites (#1088) 2023-06-09 06:18:07 +02:00
version2.dk.txt
verybestbaking.com.txt
vg.no.txt
viaoccitanie.tv.txt
vice.com.txt Update vice.com.txt 2022-05-12 14:15:23 +01:00
videogameschronicle.com.txt Create videogameschronicle.com.txt (#1717) 2025-07-19 08:55:46 +02:00
videogum.com.txt
vienna.at.txt Update vienna.at.txt (#1397) 2024-06-21 17:59:06 +02:00
viget.com.txt
villagevoice.com.txt Create villagevoice.com.txt 2020-10-13 16:28:12 +02:00
vimeo.com.txt
vincent.jousse.org.txt fix: index page parsing for https://vincent.jousse.org (#1544) 2025-01-08 17:35:18 +01:00
viply.de.txt
virten.net.txt Create virten.net.txt (#801) 2020-09-11 05:53:19 +02:00
visir.is.txt
visual-planning.com.txt Update visual-planning.com.txt 2024-12-02 13:53:15 +01:00
visualcapitalist.com.txt Add visualcapitalist.com (#965) 2022-04-13 14:41:55 +02:00
vitispr.com.txt
vivirmexico.com.txt
vk.com.txt Create vk.com.txt 2020-08-26 12:46:13 +02:00
vogue.co.uk.txt Add vogue.com and vogue.co.uk (#1480) 2024-11-04 00:37:31 +01:00
vogue.com.txt Add vogue.com and vogue.co.uk (#1480) 2024-11-04 00:37:31 +01:00
voices.washingtonpost.com.txt
voidstern.net.txt Create voidstern.net.txt (#1546) 2025-01-08 23:00:39 +01:00
volksfest-freising.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
volkskrant.nl.txt
voltairenet.org.txt
vot-tak.tv.txt Update vot-tak.tv.txt 2021-08-23 14:23:32 +02:00
vox.com.txt vox.com.txt: restore h3 and strip related section (#939) 2022-02-28 06:32:19 +01:00
voxeurop.eu.txt
vozpopuli.com.txt
vr-zone.com.txt
vrt.be.txt Update vrt.be.txt 2021-09-19 10:15:43 +02:00
vulture.com.txt Update vulture.com.txt (#1619) 2025-05-13 21:01:27 +02:00
w3.org.txt
wa.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
wallabag.org.txt Create wallabag.org.txt (#1479) 2024-11-03 18:28:55 +01:00
warnerbros.fr.txt
warriordudimanche.net.txt
washingtoninstitute.org.txt
washingtonmonthly.com.txt
washingtonpost.com.txt Update washingtonpost.com.txt 2024-07-17 15:50:39 +02:00
wasserburg24.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
watchlist-internet.at.txt Update watchlist-internet.at.txt (#1413) 2024-07-30 17:24:32 +02:00
watoday.com.au.txt
watson.ch.txt Update watson.ch.txt 2024-07-25 15:03:15 +02:00
watson.de.txt Update watson.de.txt 2024-07-25 15:03:33 +02:00
waz.de.txt Funke (#1373) 2024-04-30 21:37:22 +02:00
web-libre.org.txt
web.dev.txt Added web.dev.txt (#808) 2020-09-20 11:57:49 +02:00
web.gekisaka.jp.txt add news.jp.txt and web.gekisaka.jp.txt (#1820) 2025-12-16 14:43:32 +01:00
web.motormagazine.co.jp.txt Add replace(h2) and use strip id or class (#1828) 2025-12-22 09:34:12 +01:00
webcg.net.txt add 3 files (#1829) 2025-12-23 11:30:11 +01:00
weblogs.asp.net.txt
webupd8.org.txt
wellcome.org.txt Update wellcome.org.txt 2022-09-27 00:58:52 +02:00
wellcomecollection.org.txt Update wellcomecollection.org.txt 2022-09-27 01:05:07 +02:00
welt.de.txt
wenow.com.txt Create wenow.com.txt (#1053) 2023-02-14 20:58:37 +01:00
werra-rundschau.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
westernadvocate.com.au.txt
wetterauer-zeitung.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
what-if.xkcd.com.txt
whatever.scalzi.com.txt
wienerzeitung.at.txt Update wienerzeitung.at.txt (#1420) 2024-08-18 11:06:29 +02:00
wiesbadener-kurier.de.txt Update wiesbadener-kurier.de.txt 2023-10-15 15:01:12 +02:00
wiesn.bayern.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
wiki.guildwars.com.txt
wiki.guildwars2.com.txt
wikihow.com.txt
wikitravel.org.txt
wikiwand.com.txt Create wikiwand.com.txt 2021-09-11 18:24:45 +02:00
will-self.com.txt
winfuture.de.txt
wired.co.uk.txt add wired.co.uk (#1098) 2023-06-19 09:06:17 +02:00
wired.com.txt Update wired.com.txt (#1750) 2025-08-25 18:29:31 +02:00
wired.jp.txt Update wired.jp.txt 2022-03-15 00:49:46 +01:00
wiwo.de.txt fix: invalid XPath 1 expressions (#1805) 2025-12-05 16:58:16 +01:00
wlz-online.de.txt Updated 57 and added 5 new domains of Ippen Group (#1787) 2025-11-13 16:05:45 +01:00
wmpoweruser.com.txt
wn.de.txt
wochenanzeiger.de.txt add wochenanzeiger.de.txt (#899) 2021-08-17 16:52:05 +02:00
woman.tvbs.com.tw.txt Update woman.tvbs.com.tw.txt (#1273) 2023-12-19 13:51:21 +01:00
woolworths.com.au.txt
wordpress.org.txt
wordswithoutborders.org.txt Create wordswithoutborders.org.txt (#1575) 2025-03-27 20:07:19 +01:00
wordyard.com.txt
world.hey.com.txt Shtrom 2023 05 (#1085) 2023-05-17 09:51:47 +02:00
worldcrunch.com.txt
worldpoultry.net.txt
worldwidewords.org.txt
wormser-zeitung.de.txt Create wormser-zeitung.de.txt 2023-10-15 14:58:52 +02:00
wornandwound.com.txt Update wornandwound.com.txt 2022-03-11 11:07:16 +01:00
woshub.com.txt Create woshub.com.txt (#871) 2021-04-06 18:00:08 +02:00
wow.joystiq.com.txt
wp.de.txt Funke (#1373) 2024-04-30 21:37:22 +02:00
wpbeginner.com.txt Create wpbeginner.com.txt 2022-03-11 11:29:27 +01:00
wphive.com.txt Create wphive.com.txt 2022-03-13 10:17:40 +01:00
wpmayor.com.txt
wr.de.txt Funke (#1373) 2024-04-30 21:37:22 +02:00
writerunboxed.com.txt Create writerunboxed.com.txt (#1165) 2023-07-21 06:25:18 +02:00
wsj.com.txt Avoid stripping images on wsj.com (#1830) 2025-12-23 14:43:17 +01:00
wsws.org.txt Added World Socialist WebSite (wsws.org). (#822) 2020-10-16 10:26:07 +02:00
www.blueapron.com.txt
www.seriouseats.com.txt
www1.folha.uol.com.br.txt
www2.cnrs.fr.txt
wyborcza.biz.txt Wyborcza (#1379) 2024-05-21 15:02:38 +02:00
wyborcza.pl.txt Wyborcza (#1379) 2024-05-21 15:02:38 +02:00
wysokieobcasy.pl.txt Update wysokieobcasy.pl.txt (#1380) 2024-05-21 15:47:24 +02:00
wz-newsline.de.txt
xataka.com.txt Create xataka.com.txt (#1625) 2025-05-26 10:44:51 +02:00
xatakaciencia.com.txt
xatakamovil.com.txt Add xatakamovil.com (#1004) 2022-11-02 09:58:01 +01:00
xda-developers.com.txt Update xda-developers.com.txt 2022-10-25 10:17:42 +02:00
xenospectrum.com.txt Update xenospectrum.com add newsphere.jp jp.reuters.com (#1847) 2026-01-11 16:34:29 +01:00
xlsemanal.com.txt
xm.com.txt Update xm.com.txt 2023-09-28 14:44:58 +02:00
xn--protin-bva.com.txt
xplanereviews.com.txt Create xplanereviews.com.txt (#1652) 2025-06-08 13:13:40 +02:00
yahoo.com.txt Changed 3 Yahoo configs (#1400) 2024-07-07 10:12:10 +02:00
ycombinator.com.txt Create ycombinator.com.txt (#1497) 2024-11-17 06:39:52 +01:00
ynet.co.il.txt
yosoy.red.txt Create yosoy.red.txt 2021-03-11 21:37:43 +01:00
yostivanich.com.txt
yourerie.com.txt
youtu.be.txt Create youtu.be.txt (#1668) 2025-06-14 12:49:48 +02:00
youtube.com.txt Update youtube.com.txt (#1608) 2025-05-08 13:37:07 +02:00
zabbix.com.txt Create zabbix.com.txt (#1748) 2025-08-17 12:50:00 +02:00
zaknrw.de.txt
zakzak.co.jp.txt add below site configs (#1849) 2026-01-17 12:51:20 +01:00
zataz.com.txt
zdf.de.txt Update zdf.de.txt (#1367) 2024-04-24 17:13:13 +02:00
zdnet.com.txt
zdnet.fr.txt Create zdnet.fr.txt (#1678) 2025-06-22 19:54:13 +02:00
zdopravy.cz.txt
ze.tt.txt
zeit.de.txt fix zeit.de pagination (#1650) 2025-06-08 12:23:38 +02:00
zerohedge.com.txt
zerokspot.com.txt
zetland.dk.txt Create zetland.dk.txt 2020-11-25 23:54:11 +01:00
zhihu.com.txt Update zhihu.com.txt 2021-02-03 13:14:38 +01:00
zhuanlan.zhihu.com.txt Update zhuanlan.zhihu.com.txt (#902) 2021-08-17 16:50:01 +02:00
zinio.com.txt
zive.cz.txt
zoomit.ir.txt
zwiftinsider.com.txt Create zwiftinsider.com.txt (#1570) 2025-03-15 14:48:33 +01:00

Full-Text RSS site config files

Full-Text RSS, our article extraction tool, uses site-specific extraction rules to improve results. Each time a URL is processed, it checks whether extraction rules exist for that site. If no rules are found, it tries to detect the content block automatically.

This repository contains the site-specific extraction rules we rely on in Full-Text RSS.

Contributing changes

We run automated tests on these files to detect issues. If you'd like to help keep these up to date, please look at the test results and see which files you'd like to contribute fixes for.

We chose GitHub for this set of files because they offer one feature which we hope will make contributing changes easier: file editing through the web interface.

You can now make changes to any of our site config files and request that your changes be pulled into the main set we maintain. When we receive a pull request we'll review the changes and if everything's okay we'll update our copy.

If a site is not in our set, you can create a file for it in the same way. See Creating files on GitHub.

How to write a site config file

The quickest and simplest way is to use our point-and-click interface. It is a simple tool intended only to create a rule that extracts the correct content block.

For further refinements, e.g. selecting the title, stripping elements, dealing with multi-page articles, please see our help page.

File naming

Use example.com.txt for

  • www.example.com
  • example.com

Use .example.com.txt for

  • sport.example.com
  • news.example.com
  • environment.example.com
  • etc.

Use sport.example.com.txt to target just that sub-domain:

  • sport.example.com

Note: .example.com.txt will not match www.example.com or example.com.
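
The naming rules above can be sketched in a few lines of Python. This is only an illustration of the convention, not the lookup code Full-Text RSS actually uses: the function name `candidate_config_files`, the "most specific first" ordering, and the handling of hosts with more than one sub-domain level are all assumptions.

```python
def candidate_config_files(host: str) -> list[str]:
    """Return config filenames that could apply to `host`, most specific first.

    Hypothetical helper: the ordering and the multi-level wildcard handling
    are assumptions, not documented behaviour of Full-Text RSS.
    """
    host = host.lower().rstrip(".")

    # www.example.com is covered by example.com.txt, so drop a leading "www.".
    if host.startswith("www."):
        host = host[len("www."):]

    # Exact match: example.com.txt or sport.example.com.txt.
    candidates = [f"{host}.txt"]

    # Wildcard match: sport.example.com -> .example.com.txt.
    # A bare domain such as example.com yields no wildcard candidate,
    # and the loop stops before the last label so it never emits ".com.txt".
    labels = host.split(".")
    for i in range(1, len(labels) - 1):
        candidates.append("." + ".".join(labels[i:]) + ".txt")

    return candidates


if __name__ == "__main__":
    for name in ("www.example.com", "example.com", "sport.example.com"):
        print(name, "->", candidate_config_files(name))
```

Under these assumptions, www.example.com and example.com both resolve to example.com.txt only, while sport.example.com is checked against sport.example.com.txt first and then .example.com.txt.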

Instapaper

When we introduced site patterns, we chose to adopt the same format used by Instapaper. This allowed us to make use of the extraction rules contributed by Instapaper users.

Marco, Instapaper's creator, graciously opened up the database of contributions to everyone:

And, recognizing that your efforts could be useful to a wide range of other tools and services, I'll make the list of all of these site-specific configurations available to the public, free, with no strings attached.

You can see the list maintained by Instapaper at instapaper.com/bodytext/ (no longer available since Instapaper was sold).

Testing site config files

Currently you need a copy of Full-Text RSS to test changes to the site config files. We will try to make this process easier in the future.