Commit Graph

21 Commits (8f21cb64afadaa582f387b29efcff1f316cf326f)

Author SHA1 Message Date
Joshua Potter 8f21cb64af Add random position. 2023-12-07 08:08:53 -07:00
Joshua Potter 44a18fc59c Add language detection for chesscom profiles. 2023-12-07 05:12:13 -07:00
Joshua Potter f2fd289225 Remove activity download. 2023-12-06 20:11:54 -07:00
Joshua Potter 47e8d245c3 Scrape titles. 2023-12-06 19:52:40 -07:00
Joshua Potter 0b9a721368 Maintain order on languages. 2023-12-05 16:06:04 -07:00
Joshua Potter f20fc76081 Load languages into the database. 2023-12-05 15:15:42 -07:00
Joshua Potter ef5d296097 Scrape languages from lichess listing. (#10) 2023-12-05 14:20:46 -07:00
Joshua Potter 82dbef21b6 Fix all mypy warnings. (#9) 2023-12-05 12:54:12 -07:00
Joshua Potter 8d7f1e7c4a Scrape content into an asynchronous pipeline. (#8) 2023-12-05 11:43:13 -07:00
Joshua Potter 63764a22c4 Transition to a CSV; Postgres can handle that better. 2023-12-04 15:08:17 -07:00
Joshua Potter 9b9e561e49 Apply pyls-isort. 2023-12-01 16:37:05 -07:00
Joshua Potter a4b1647e53 Allow specifying multiple sites in command line. 2023-12-01 16:36:22 -07:00
Joshua Potter 0c4e008b45 Rewrite export as NDJSON and include script to load result into postgres. (#3) 2023-12-01 10:30:44 -07:00
  * Allow loading exported data into database.
  * Explanation on E2E.
Joshua Potter 9b81105a5e Use lxml to speed up parsing. 2023-12-01 07:12:40 -07:00
Joshua Potter d549e5f5eb Export blitz and bullet ratings. 2023-12-01 07:10:58 -07:00
Joshua Potter 36d471e395 Export rapid ratings. 2023-11-30 20:35:20 -07:00
Joshua Potter e050d13aa7 Add class for wrapping around exports. 2023-11-30 17:30:28 -07:00
Joshua Potter bc2ffeae9d Add a scraper for lichess. (#2) 2023-11-30 15:36:44 -07:00
Joshua Potter 10801b560c Generalize in anticipation of merging the lichess scraper. (#1) 2023-11-30 15:15:15 -07:00
  * Add a general `Scraper` class.
  * Setup main as primary entrypoint.
  * Abstract original scraper into scraper class.
  * Add better logging and cleaner bash commands.
  * Ensure exporting works.
Joshua Potter 3cc31f8f24 Add guard on failed page download. 2023-11-28 07:57:05 -07:00
Joshua Potter fe2e504de9 Package into app for `nix build`. 2023-11-28 05:53:09 -07:00