Joshua Potter
8d7f1e7c4a
Scrape content into an asynchronous pipeline. (#8)
2023-12-05 11:43:13 -07:00
Joshua Potter
63764a22c4
Transition to a CSV; Postgres can handle that better.
2023-12-04 15:08:17 -07:00
Joshua Potter
1c0dc05b42
Separate initialization from loading. Prefer upserts.
2023-12-04 08:14:33 -07:00
Joshua Potter
9b9e561e49
Apply pyls-isort.
2023-12-01 16:37:05 -07:00
Joshua Potter
a4b1647e53
Allow specifying multiple sites in command line.
2023-12-01 16:36:22 -07:00
Joshua Potter
0c4e008b45
Rewrite export as NDJSON and include script to load result into postgres. (#3)
* Allow loading exported data into database.
* Explanation on E2E.
2023-12-01 10:30:44 -07:00
Joshua Potter
bc2ffeae9d
Add a scraper for lichess. (#2)
2023-11-30 15:36:44 -07:00
Joshua Potter
10801b560c
Generalize in anticipation of merging the lichess scraper. (#1)
* Add a general `Scraper` class.
* Set up main as primary entrypoint.
* Abstract original scraper into scraper class.
* Add better logging and cleaner bash commands.
* Ensure exporting works.
2023-11-30 15:15:15 -07:00
Joshua Potter
fe2e504de9
Package into app for `nix build`.
2023-11-28 05:53:09 -07:00
Joshua Potter
99c89a3a6d
Restructure and add documentation. Require specifying user-agent.
2023-11-27 20:06:42 -07:00
Joshua Potter
1a4199e95d
Add README descriptions.
2023-11-27 14:13:46 -07:00
Joshua Potter
1710e1aefa
Initial commit.
2023-11-27 13:09:40 -07:00