Add README descriptions.
parent 932d97bf3b
commit 1a4199e95d
48
README.md
@@ -1,14 +1,48 @@
-# Poetry Flake Template
+# chesscom-scraper
 
-This is a template for constructing a working environment for Python (version
-3.11.6) development. Packaging and dependency management relies on [poetry](https://python-poetry.org/)
-(version 1.7.0). [direnv](https://direnv.net/) can be used to a launch a dev
-shell upon entering this directory (refer to `.envrc`). Otherwise run via:
+**Caution! Be careful running this script.**
+
+We intentionally delay each request sent anywhere from 10 to 15 seconds. Make
+sure any adjustments to this script appropriately rate-limit.
+
+## Overview
+
+This is a simple web scraper for [chess.com](https://www.chess.com/coaches)
+coaches. Running:
+```bash
+$> python3 main.py
+```
+will query [chess.com](https://www.chess.com) for all listed coaches as well as
+specific information about each of them (their profile, recent activity, and
+stats). The result will be found in a newly created `data` directory with the
+following structure:
+```
+data
+├── coach
+│   ├── <member_name>
+│   │   ├── <member_name>.html
+│   │   ├── activity.json
+│   │   └── stats.json
+│   ├── ...
+└── pages
+    ├── <n>.txt
+    ├── ...
+```
+
+Here, `member_name` corresponds to the name of the coach whereas `pages`
+contains a fragmented list of URLs to coach profiles.
+
+## Development
+
+This script was written using Python (version 3.11.6). Packaging and dependency
+management relies on [poetry](https://python-poetry.org/) (version 1.7.0).
+[direnv](https://direnv.net/) can be used to a launch a dev shell upon entering
+this directory (refer to `.envrc`). Otherwise run via:
 ```bash
 $> nix develop
 ```
 
-## Language Server
+### Language Server
 
 The [python-lsp-server](https://github.com/python-lsp/python-lsp-server)
 (version v1.9.0) is included in this flake, along with the [python-lsp-black](https://github.com/python-lsp/python-lsp-black)
@@ -17,7 +51,7 @@ plugin for formatting purposes. `pylsp` is expected to be configured to use
 and [pyflakes](https://github.com/PyCQA/pyflakes). Refer to your editor for
 configuration details.
 
-## Formatting
+### Formatting
 
 Formatting depends on the [black](https://black.readthedocs.io/en/stable/index.html)
 (version 23.9.1) tool. A `pre-commit` hook is included in `.githooks` that can
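
The README text added in this commit says each request is delayed by anywhere from 10 to 15 seconds and asks that any adjustment stay rate-limited. A minimal sketch of that kind of randomized delay might look like the following. The function name `polite_delay`, its parameters, and the constants are hypothetical: `main.py` is not part of this diff, so this is not necessarily how the script implements the delay.

```python
import random
import time

# Bounds taken from the README text; the constant names are illustrative only.
MIN_DELAY_SECONDS = 10.0
MAX_DELAY_SECONDS = 15.0


def polite_delay(low=MIN_DELAY_SECONDS, high=MAX_DELAY_SECONDS, sleep=time.sleep):
    """Pause for a random duration between low and high seconds and return it."""
    if low < 0 or high < low:
        raise ValueError("expected 0 <= low <= high")
    delay = random.uniform(low, high)
    sleep(delay)  # injectable, so the pause can be skipped in tests
    return delay
```

Calling `polite_delay()` before each outgoing request would keep the request rate within the range the README describes. Passing a different `sleep` callable makes it possible to exercise the logic without actually waiting.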
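
The `data` directory layout described in the README can be read back with the standard library alone. The sketch below rests on assumptions this diff does not confirm: that `activity.json` and `stats.json` hold valid JSON, and that each `pages/<n>.txt` file lists one URL per line. The helper names `load_coach` and `list_profile_urls` are hypothetical.

```python
import json
from pathlib import Path


def load_coach(data_dir, member_name):
    """Load the files stored under <data_dir>/coach/<member_name>/."""
    coach_dir = Path(data_dir) / "coach" / member_name
    result = {"html_path": coach_dir / f"{member_name}.html"}
    for key in ("activity", "stats"):
        path = coach_dir / f"{key}.json"
        # A missing file maps to None instead of raising an error.
        if path.exists():
            result[key] = json.loads(path.read_text(encoding="utf-8"))
        else:
            result[key] = None
    return result


def list_profile_urls(data_dir):
    """Collect non-empty lines from <data_dir>/pages/*.txt (assumed one URL per line)."""
    urls = []
    # Files are visited in lexicographic order, so 10.txt sorts before 2.txt.
    for page in sorted((Path(data_dir) / "pages").glob("*.txt")):
        for line in page.read_text(encoding="utf-8").splitlines():
            if line.strip():
                urls.append(line.strip())
    return urls
```

For example, `load_coach("data", "<member_name>")` would return a dictionary with the parsed `activity` and `stats` contents plus the path to the saved profile HTML, and `list_profile_urls("data")` would flatten the fragmented page files into a single list.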