add faq #197

Merged
merged 4 commits into from
Mar 11, 2024
2 changes: 1 addition & 1 deletion docs/_toc.yml
@@ -45,8 +45,8 @@ chapters:
- file: reference/simulations/projectile
- file: reference/emulators/index
sections:
- file: reference/emulators/gaussian_process_sk
- file: reference/emulators/gaussian_process
- file: reference/emulators/gaussian_process_mogp
- file: reference/emulators/gradient_boosting
- file: reference/emulators/neural_net_sk
- file: reference/emulators/neural_net_torch
2 changes: 2 additions & 0 deletions docs/community/faq/faq-contributors.md
@@ -1,5 +1,7 @@
# First-Time Contributors' Frequently Asked Questions

**TODO**
Collaborator

What do you think is needed here? An intro to the page, perhaps?

Collaborator

Reading through the answers, perhaps we should add "You need to be familiar with..." and then list what prerequisites there are for the package. For instance, a good understanding of machine learning concepts, and NumPy as a package?

Collaborator Author

I just put a todo to include it but show that it's not ready, and I'm not sure we can get this done this week.

About the prerequisites list: Yes, I think that's a good idea!


## Getting Started

1. How can I contribute to AutoEmulate?
56 changes: 36 additions & 20 deletions docs/community/faq/faq-users.md
@@ -2,69 +2,85 @@

## General Questions

1. What is AutoEmulate?
1. What is `AutoEmulate`?
<!-- A brief description of what the package does, its main features, and its intended use case. -->
- A Python package that makes it easy to build emulators for complex simulations. It takes a set of simulation inputs `X` and outputs `y`, and automatically fits, optimises and evaluates various machine learning models to find the best emulator model. The emulator model can then be used as a drop-in replacement for the simulation, but will be much faster and computationally cheaper to evaluate.

2. How do I install AutoEmulate?
2. How do I install `AutoEmulate`?
<!-- Step-by-step instructions on installing the package, including any dependencies that might be required. -->
- See the [installation guide](../../getting-started/installation.md) for detailed instructions.

3. What are the prerequisites for using AutoEmulate?
3. What are the prerequisites for using `AutoEmulate`?
<!-- Information on the knowledge or data required to effectively use AutoEmulate, such as familiarity with Python, machine learning concepts, or specific data formats. -->
- `AutoEmulate` is designed to be easy to use. The user has to first generat a dataset of simulation inputs `X` and outputs `y`, and optimally have a basic understanding of Python and machine learning concepts.
Collaborator

generat --> generate


## Usage Questions

1. How do I start using AutoEmulate with my data?
1. How do I start using `AutoEmulate` with my simulation?
<!-- A simple example to get a new user started, possibly pointing to more detailed tutorials or documentation. -->
- See the [getting started guide](../../getting-started/quickstart.ipynb) or a more [in-depth tutorial](../../tutorials/01_start.ipynb).

2. What kind of data can I analyze with AutoEmulate?
2. What kind of data does `AutoEmulate` need to build an emulator?
<!-- Clarification on the types of datasets suitable for analysis, including data formats and recommended data sizes. -->

3. How do I interpret the results from AutoEmulate?
- `AutoEmulate` takes simulation inputs `X` and simulation outputs `y` to build an emulator. `X` is an ndarray of shape `(n_samples, n_parameters)` and `y` is an ndarray of shape `(n_samples, n_outputs)`. Each sample is one simulation run, so each row of `X` holds a set of input parameters and the matching row of `y` holds the simulation output for those parameters. Currently, all inputs and outputs must be numeric, and missing data is not supported.

- All models work with multi-output data. We have optimised `AutoEmulate` to work with smaller datasets (in the order of hundreds to thousands of samples). Training emulators with large datasets (hundreds of thousands of samples) may currently require a long time and is not recommended.

3. How do I interpret the results from `AutoEmulate`?
<!-- Guidance on understanding the output of the software, including any metrics or visualizations it produces. -->
- See the [tutorial](../../tutorials/01_start.ipynb) for an example of how to interpret the results from `AutoEmulate`. Briefly, `X` and `y` are first split into training and test sets. Cross-validation and/or hyperparameter optimisation are performed on the training data. After comparing the results from different emulators, the user can evaluate the chosen emulator on the test set with `autoemulate.evaluate_model()`, and plot test set predictions with `autoemulate.plot_model()`.
Collaborator

We could link autoemulate.evaluate_model() and autoemulate.plot_model() here to the API docs maybe?

Collaborator Author
@mastoffel, Mar 6, 2024

yes, doing that now. @kallewesterling do you know whether there is a way to directly link to a class method (instead of just the class?)


- An important thing to note is that the emulator can only be as good as the data it was trained on. Therefore, the experimental design (on which points the simulation was evaluated) is key to obtaining a good emulator.

4. Can I use AutoEmulate for commercial purposes?
4. Can I use `AutoEmulate` for commercial purposes?
<!-- Information on licensing and any restrictions on use. -->
- Yes. It's licensed under the MIT license, which allows for commercial use. See the [license](../../../LICENSE) for more information.
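
The workflow described in the answers above (2-D `X` and `y` arrays, a train/test split, cross-validation on the training data, then a single evaluation on the test set) can be sketched with plain scikit-learn. This is only an illustration under assumptions: it does not call `AutoEmulate` itself, the toy data stands in for real simulation runs, and the two candidate models are arbitrary choices rather than the package's actual emulator list.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.model_selection import cross_val_score, train_test_split

rng = np.random.default_rng(0)

# Toy stand-in for simulation data: 200 runs, 3 input parameters, 2 outputs.
X = rng.uniform(-1.0, 1.0, size=(200, 3))
y = np.column_stack([np.sin(X[:, 0]) + X[:, 1] ** 2, X[:, 1] * X[:, 2]])

# Requirements from the FAQ: 2-D numeric arrays, one row per run, no missing values.
assert X.ndim == 2 and y.ndim == 2 and X.shape[0] == y.shape[0]
assert np.isfinite(X).all() and np.isfinite(y).all()

# Hold out a test set that is only used once, at the very end.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Arbitrary candidate models (assumption); both accept multi-output targets.
candidates = {
    "gaussian_process": GaussianProcessRegressor(alpha=1e-6, normalize_y=True),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Compare candidates by mean 5-fold cross-validated R^2 on the training data only.
cv_scores = {
    name: cross_val_score(model, X_train, y_train, cv=5, scoring="r2").mean()
    for name, model in candidates.items()
}

# Refit the best candidate on all training data and score it on the held-out set.
best_name = max(cv_scores, key=cv_scores.get)
best_model = candidates[best_name].fit(X_train, y_train)
test_r2 = best_model.score(X_test, y_test)
print(cv_scores, best_name, test_r2)
```

Keeping the test set out of model selection is what makes the final score an honest estimate of how the emulator behaves on unseen simulation inputs.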

## Advanced Usage

1. How can I customize simulations in AutoEmulate?
<!-- Explanation of how users can adjust parameters or settings to tailor simulations to their specific research questions. -->

2. Does AutoEmulate support parallel processing or high-performance computing (HPC) environments?
1. Does AutoEmulate support parallel processing or high-performance computing (HPC) environments?
<!-- Details on the software's capabilities to leverage multi-threading, distributed computing, or HPC resources to speed up computations. -->
- Yes, `setup()` has an `n_jobs` parameter which lets you parallelise cross-validation and hyperparameter optimisation.
Collaborator

Let's link setup() to the API docs like above!

Collaborator Author

Linking now to the autoemulate.compare module, as I'm currently not sure how to link to a method within the module. I think this needs modification of the rst file, but I don't know whether this should be done manually.


3. Can AutoEmulate be integrated with other data analysis or simulation tools?
2. Can AutoEmulate be integrated with other data analysis or simulation tools?
<!-- Information on APIs, file formats, or protocols that facilitate the integration of AutoEmulate with other software ecosystems. -->
- `AutoEmulate` takes simple `X` and `y` ndarrays as input, and returns emulator models that can be saved and loaded with `joblib`. All emulators are written as scikit-learn estimators, so they can be used like any other scikit-learn model in a pipeline.
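
As a rough sketch of what "used like any other scikit-learn model" and "saved and loaded with `joblib`" can look like, the example below puts a stand-in estimator in a pipeline, cross-validates it in parallel and round-trips it through `joblib`. The `RandomForestRegressor` is an assumed placeholder for an emulator, the file name is hypothetical, and `n_jobs` here is scikit-learn's own argument to `cross_val_score`, not the `setup()` parameter mentioned above.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Toy data: 120 simulation runs, 4 input parameters, 2 outputs.
X = rng.uniform(0.0, 1.0, size=(120, 4))
y = np.column_stack([X[:, 0] + X[:, 1], X[:, 2] * X[:, 3]])

# Placeholder emulator (assumption) inside an ordinary scikit-learn pipeline.
emulator = make_pipeline(
    StandardScaler(),
    RandomForestRegressor(n_estimators=50, random_state=1),
)

# The four cross-validation folds are fitted in two parallel worker processes.
scores = cross_val_score(emulator, X, y, cv=4, n_jobs=2)

# Fit on all data, save to disk, and load the model back.
emulator.fit(X, y)
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "emulator.joblib")  # hypothetical file name
    joblib.dump(emulator, path)
    restored = joblib.load(path)

# The restored model should reproduce the original predictions.
same = np.allclose(emulator.predict(X[:5]), restored.predict(X[:5]))
print(scores.mean(), same)
```

Models saved this way are generally expected to be loaded with the same scikit-learn and NumPy versions they were saved with, which is presumably why `best_model_meta.json` in this PR records both.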

## Data Handling

1. What are the best practices for data preprocessing before using AutoEmulate?
1. What are the best practices for data preprocessing before using `AutoEmulate`?
<!-- Tips and recommendations on preparing data, including normalisation, dealing with missing values, or data segmentation. -->
- The user will typically run their simulation on a selected set of input parameters (the experimental design), chosen with a Latin hypercube or another sampling method. `AutoEmulate` currently needs all inputs to be numeric and doesn't support missing data. By default, `AutoEmulate` will scale the input data to zero mean and unit variance, and there's the option to do dimensionality reduction in `setup()`.

2. How does AutoEmulate handle large datasets?
<!-- Advice on managing large-scale data analyses, potential memory management features, or ways to streamline processing. -->

3. Can I use AutoEmulate for real-time data analysis?
<!-- Insights into the software's ability to process data in real-time and any limitations or considerations. -->
- `AutoEmulate` is optimised to work with smaller datasets (in the order of hundreds to thousands of samples). Training emulators with large datasets (hundreds of thousands of samples) may currently require a long time and is not recommended. Emulators are created because it's expensive to evaluate the simulation, so we expect most users to have a relatively small dataset.
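
The preprocessing steps mentioned above (a Latin hypercube design, numeric inputs without missing values, scaling to zero mean and unit variance) might look roughly like this when done by hand with SciPy and scikit-learn. The parameter ranges and the `run_simulation` function are hypothetical stand-ins, and the explicit `StandardScaler` only illustrates the kind of scaling the FAQ says `AutoEmulate` applies by default.

```python
import numpy as np
from scipy.stats import qmc
from sklearn.preprocessing import StandardScaler

# Hypothetical lower and upper bounds for a 3-parameter simulation.
lower = np.array([0.1, 10.0, -5.0])
upper = np.array([1.0, 50.0, 5.0])

# Latin hypercube design: 100 points in the unit cube, rescaled to the bounds.
sampler = qmc.LatinHypercube(d=3)
unit_samples = sampler.random(n=100)
X = qmc.scale(unit_samples, lower, upper)


def run_simulation(params):
    """Hypothetical stand-in for an expensive simulator with two outputs."""
    a, b, c = params
    return np.array([a * b, b + c ** 2])


# One simulation run per design point: y has shape (n_samples, n_outputs).
y = np.array([run_simulation(row) for row in X])

# Inputs and outputs must be numeric and free of missing values.
assert np.issubdtype(X.dtype, np.number) and np.isfinite(X).all()
assert np.isfinite(y).all()

# Scale inputs to zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(X)
print(X.shape, y.shape, X_scaled.mean(axis=0), X_scaled.std(axis=0))
```

A space-filling design like this spreads the limited number of simulation runs evenly over the parameter ranges, which matters because the emulator can only be as good as the points it was trained on.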

## Troubleshooting

1. What common issues might I encounter when using AutoEmulate, and how can I solve them?
1. What common issues might I encounter when using `AutoEmulate`, and how can I solve them?
<!-- A list of frequently encountered problems with suggested solutions, possibly linked to a more extensive troubleshooting guide. -->
- TODO
Collaborator

Maybe @bryanlimy can be helpful here in coming up with commonly-run-into problems?

Member

Once we have an option to adjust the verbosity of the codebase, then that would be one way to see the underlying error/issue. Right now I don't think there is an easy way to debug the package (from a user perspective) without digging into the source code of both the codebase and the packages we use.

Collaborator Author

PR #200 now adds the option to log everything to a file or to print all logs to screen, and should capture the complete traceback when models fail. Will add this here.


2. How can I report a bug or request a feature in AutoEmulate?
2. How can I report a bug or request a feature in `AutoEmulate`?
<!-- Instructions on the proper channels for reporting issues or suggesting enhancements, including any templates or information to include. -->
- Please open an issue using the [bug issue template](https://github.com/alan-turing-institute/autoemulate/issues/new/choose).
Collaborator

This actually links to the page where the user can open a bug issue report OR request a new feature, so good to have the link here but suggestion to write something like (this isn't great so feel free to edit): "You can report a bug or request new features through the issue templates in our GitHub repository. Head on over there and choose one of the templates for your purpose and get started."

Collaborator Author

Yes, agree, changed the text to your suggestion!


## Community and Learning Resources

1. Are there any community projects or collaborations using AutoEmulate I can join or learn from?
1. Are there any community projects or collaborations using `AutoEmulate` I can join or learn from?
<!-- Information on community-led projects, study groups, or collaborative research initiatives involving AutoEmulate. -->
- TODO
Collaborator

Always good to connect people up. I wonder if we could mention that folks can reach out if they want invites to the Slack channel. That might be helpful for others, and we could invite them as guests to the channel.

Collaborator Author

Can we invite people from outside the Turing to the Slack channel?

Collaborator Author

I've added a sentence to reach out to you or me for now!


2. Where can I find tutorials or case studies on using AutoEmulate?
2. Where can I find tutorials or case studies on using `AutoEmulate`?
<!-- Directions to comprehensive learning materials, such as video tutorials (if we want to record that), written guides, or published research papers using AutoEmulate. -->
- See the [tutorials](../../tutorials/01_start.ipynb) for a comprehensive guide on using the package.

3. How can I stay updated on new releases or updates to AutoEmulate?
<!-- Guidance on subscribing to newsletters when/if we will have that, community calls if we start that, following the project on social media if we want to create those platforms, or joining community forums/Slack once we have that ready... -->
- Watch the [AutoEmulate repository](https://github.com/alan-turing-institute/autoemulate).

4. What support options are available if I need help with AutoEmulate?
<!-- Overview of support resources, including documentation, community forums/Slack when we have that ready... -->
- Please open an issue or contact the maintainer at [email](mailto:[email protected]) directly.
Binary file modified docs/getting-started/best_model
Binary file not shown.
2 changes: 1 addition & 1 deletion docs/getting-started/best_model_meta.json
@@ -1 +1 @@
{"model": "GaussianProcessSk", "scikit-learn": "1.3.2", "numpy": "1.23.5"}
{"model": "GaussianProcess", "scikit-learn": "1.3.2", "numpy": "1.23.5"}
4 changes: 2 additions & 2 deletions docs/getting-started/installation.md
@@ -1,10 +1,10 @@
# Installation instructions

AutoEmulate is a Python package that can be installed in a number of ways. In this section we will describe the main ways to install the package.
`AutoEmulate` is a Python package that can be installed in a number of ways. In this section we will describe the main ways to install the package.

## Install from PyPI

This is the easiest way to install AutoEmulate.
This is the easiest way to install `AutoEmulate`.

Currently, because we are in active development, you have to install the development version from GitHub:

6 changes: 6 additions & 0 deletions docs/reference/emulators/gaussian_process_mogp.rst
@@ -0,0 +1,6 @@
autoemulate.emulators.gaussian_process_mogp
===========================================

.. automodule:: autoemulate.emulators.gaussian_process_mogp
:members:
:show-inheritance:
6 changes: 0 additions & 6 deletions docs/reference/emulators/gaussian_process_sk.rst

This file was deleted.
