Mirror of https://github.com/google-deepmind/deepmind-research.git (synced 2026-05-19 19:01:30 +08:00)
Misc README fixes.
PiperOrigin-RevId: 379669157
Committed by: Saran Tunyasuvunakool
Parent: 4c80e527c4
Commit: 438d06513e
+2
-2
@@ -1,6 +1,6 @@
# DeepMind entry for PCQM4M-LSC
# DeepMind entry for OGB-LSC

This repository contains DeepMind's entry to the [PCWM4M-LSC](https://ogb.stanford.edu/kddcup2021/pcqm4m/) (quantum chemistry) and
This repository contains DeepMind's entry to the [PCQM4M-LSC](https://ogb.stanford.edu/kddcup2021/pcqm4m/) (quantum chemistry) and
[MAG240M-LSC](https://ogb.stanford.edu/kddcup2021/mag240m/) (academic graph)
tracks of the [OGB Large-Scale Challenge](https://ogb.stanford.edu/kddcup2021/)
(OGB-LSC).

+19
-8
@@ -60,7 +60,7 @@ See https://github.com/google/jax/issues/5231 for details.
`ROOT`.**

**2. Run this script to reorganize the data into a flat directory structure with
transparent names**
transparent names.**

```bash
/bin/bash organize_data.sh -r ROOT
@@ -81,12 +81,13 @@ created, with contents:

We refer to this as the "raw" data.

**3. To run the preprocessing code**
**3. Run the preprocessing code.**

```bash
/bin/bash run_preprocessing.sh -r ROOT
```

The pre-processing is very time- and memory-consuming, and should only be run
The pre-processing is both time- and memory-consuming, and should only be run
to verify the full pipeline. You can download the pre-processed data using the
following script, for use in training and evaluating models:

@@ -99,11 +100,16 @@ python3 download_mag.py --task_root=${HOME}/mag --payload="data"

We have provided pre-trained weights of our final submission for convenience.
They can be downloaded with:
```

```bash
python3 download_mag.py --task_root=${HOME}/mag --payload="models"
```
Then to reproduce our final results, please run `bash run_pretrain_eval.sh`.

Then to reproduce our final results, please run:

```bash
/bin/bash run_preprocessing.sh -r ${HOME}/mag/
```

## Retraining our model

@@ -111,9 +117,14 @@ Disclaimer: This script is provided for illustrative purposes. It is not
practical for actual training since it only uses a single machine, and likely
requires reducing the batch size and/or model size to fit on a single GPU.

If you still want to train a model, please run `run_training.sh`. To simply
validate that the code is running correctly on your hardware setup, consider
setting `debug=True` in `config.py`, which trains a smaller model.
To train a model, please run:

```bash
/bin/bash run_training.sh -r ${HOME}/mag/
```

To simply validate that the code is running correctly on your hardware setup,
consider setting `debug=True` in `config.py`, which trains a smaller model.

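The contents of `config.py` are not part of this diff, so how `debug=True` shrinks the model is not shown here. As a rough illustration only, a config with such a switch might look like the sketch below. `TrainConfig`, `get_config`, and every field name and value are hypothetical and are not taken from the actual `config.py`.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class TrainConfig:
    """Hypothetical training settings; not the fields of the real config.py."""
    num_layers: int
    hidden_size: int
    batch_size: int


def get_config(debug: bool = False) -> TrainConfig:
    """Returns full-size settings, or a much smaller variant when debug=True."""
    if debug:
        # Small values so a smoke test can fit on a single modest GPU.
        return TrainConfig(num_layers=2, hidden_size=32, batch_size=8)
    # Placeholder "full" values, chosen only for illustration.
    return TrainConfig(num_layers=16, hidden_size=512, batch_size=256)
```

Keeping the debug branch in the same function as the full settings means a smoke test exercises the same code path as a real run, only with smaller shapes.
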
# Citation

+24
-25
@@ -1,6 +1,6 @@
# DeepMind entry for PCQM4M-LSC

This repository contains DeepMind's entry to the [PCWM4M-LSC](https://ogb.stanford.edu/kddcup2021/pcqm4m/) (quantum chemistry)
This repository contains DeepMind's entry to the [PCQM4M-LSC](https://ogb.stanford.edu/kddcup2021/pcqm4m/) (quantum chemistry)
track of the [OGB Large-Scale Challenge](https://ogb.stanford.edu/kddcup2021/)
(OGB-LSC).

@@ -48,55 +48,54 @@ pip3 install --upgrade pip setuptools wheel
pip3 install -r ogb_lsc/pcq/requirements.txt
```

Use the following command to get a jaxlib version built compatible with V100 GPUs.
```bash
pip install --upgrade jax jaxlib==0.1.67+cuda110 -f https://storage.googleapis.com/jax-releases/jax_releases.html
```
See https://github.com/google/jax/issues/5231 for details.
## Download and pre-process data


## Downloading data and model weights

All necessary data and pre-trained model weights can be downloaded by running
the following command.
This downloads about ~ 150 GB worth of model checkpoints.
All the additional features used in training (k-fold splits and conformer
position features) can be generated by running:

```bash
python download_required_pcq_data.py --data_root=${HOME}/data/
/bin/bash run_preprocessing.sh -r ${HOME}/pcq/
```

## Generating Pre-processed features
Or downloaded using:

All the additional features used in training
(k-fold splits and conformer position features) can be generated by running.
```bash
/bin/bash run_preprocessing.sh -r ${HOME}/data/pcq/
python download_pcq.py --task_root=${HOME}/pcq/ --payload="data"
```

## Reproducing our final results

We have provided pre-trained weights of our final submission for convenience. To
reproduce our final results, please run `run_pretrained_eval.sh` as follows.
We have provided pre-trained weights of our final submission (~150 GB worth of
model checkpoints) for convenience, which can be downloaded with:

```bash
/bin/bash run_pretrained_eval.sh -r ${HOME}/data/pcq/
python download_pcq.py --task_root=${HOME}/pcq/ --payload="models"
```

Then to reproduce our final results please run:

```bash
/bin/bash run_pretrained_eval.sh -r ${HOME}/pcq/
```

Note that this script does not use the downloaded conformer position features,
and instead computes them for the test set as part of the script.

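The updated PCQ instructions amount to three commands run in order: download the pre-processed data, download the model checkpoints, then run the evaluation script. The sketch below only assembles those commands for a given task root, without executing anything. The helper name `pcq_commands` is hypothetical, and the sketch assumes it would be run from the directory containing `download_pcq.py` and `run_pretrained_eval.sh`.

```python
import os
import shlex
from typing import List


def pcq_commands(task_root: str) -> List[List[str]]:
    """Builds the PCQ commands quoted above, in order, without running them."""
    return [
        # Pre-processed features (k-fold splits and conformer positions).
        ["python", "download_pcq.py", f"--task_root={task_root}", "--payload=data"],
        # Pre-trained checkpoints of the final submission (~150 GB).
        ["python", "download_pcq.py", f"--task_root={task_root}", "--payload=models"],
        # Evaluation with the downloaded weights.
        ["/bin/bash", "run_pretrained_eval.sh", "-r", task_root],
    ]


if __name__ == "__main__":
    # Mirrors the ${HOME}/pcq/ root used in the commands above.
    root = os.path.join(os.path.expanduser("~"), "pcq") + "/"
    for argv in pcq_commands(root):
        print(shlex.join(argv))
```

Each inner list can be passed to `subprocess.run(argv, check=True)` to execute the step and stop on the first failure.
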
## Retraining our model

Disclaimer: This script is provided for illustrative purposes. It is not
practical for actual training since it only uses a single machine, and likely
requires reducing the batch size and/or model size to fit on a single GPU.

If you still want to train a model, please run `run_training.sh`. To simply
validate that the code is running correctly on your hardware setup, consider
setting `debug=True` in `config.py`, which trains a smaller model.

To train a model, please run:

```bash
/bin/bash run_training.sh -r ${HOME}/data/pcq/
/bin/bash run_training.sh -r ${HOME}/pcq/
```

To simply validate that the code is running correctly on your hardware setup,
consider setting `debug=True` in `config.py`, which trains a smaller model.

# Citation
