Ensemble AI API Docs

Dark Matter

Ensemble AI's Core IP algorithm for generating highly predictive embeddings for any machine learning dataset.

class DarkMatter(input_column_size: int,
                 task_target_size: int,
                 output_column_size: int,
                 task: str,
                 X: DarkMatterType,
                 y: DarkMatterType,
                 explainability: bool = False,
                 column_names: list[str] | None = None,
                 project_name: str | None = None,
                 max_iter: int = 10_000,
                 batch_size: int | float = 1000)

Parameters

input_column_size: int

The total number of input features.

output_column_size: int

The number of desired output "features" (embedding length). Must be a number between 1 and 500, inclusive.

task_target_size: int

The total number of targets. Currently, only 1 is supported.

task: str

One of "regression" or "classification".
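The constraints stated above can be checked before the class is constructed. The following is a minimal sketch that assumes only the limits given on this page; `check_dark_matter_args` is a hypothetical helper and not part of the Dark Matter API.

```python
def check_dark_matter_args(output_column_size: int,
                           task_target_size: int,
                           task: str) -> None:
    """Hypothetical pre-flight check; not part of the Dark Matter API."""
    # The embedding length must be between 1 and 500, inclusive.
    if not 1 <= output_column_size <= 500:
        raise ValueError("output_column_size must be between 1 and 500")
    # Only a single target is currently supported.
    if task_target_size != 1:
        raise ValueError("task_target_size must be 1")
    # The task must be one of the two documented strings.
    if task not in ("regression", "classification"):
        raise ValueError('task must be "regression" or "classification"')
```

Calling `check_dark_matter_args(32, 1, "regression")` returns silently, while an out-of-range value raises `ValueError`.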

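X: DarkMatterType

The input data feature set.

y: DarkMatterType

The input data targets.

explainability: bool, default=False

Enable/disable explainability mode.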

column_names: list[str] | None, default=None

Optional feature names for use with explainability mode. If none are specified, the names are inferred from the X argument when it is a pandas DataFrame; otherwise generic names ["feature_0", "feature_1", ...] are used.
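The fallback order described above could look roughly like the sketch below. `resolve_column_names` is a hypothetical helper, and detecting a DataFrame by its `.columns` attribute is an assumption made to keep the example free of third-party imports; the library's own logic may differ.

```python
from typing import Any, List, Optional, Sequence


def resolve_column_names(X: Any,
                         column_names: Optional[Sequence[str]],
                         input_column_size: int) -> List[str]:
    """Hypothetical helper mirroring the documented fallback order."""
    # 1. Explicit names always win.
    if column_names is not None:
        return list(column_names)
    # 2. A pandas DataFrame exposes its labels through `.columns`
    #    (duck-typed here; an assumption, not the library's check).
    columns = getattr(X, "columns", None)
    if columns is not None:
        return [str(name) for name in columns]
    # 3. Otherwise fall back to generic names.
    return [f"feature_{i}" for i in range(input_column_size)]
```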

project_name: str | None, default=None

Optional name used when saving algorithm weights to disk. The default is a "YY-MM-DD_hh-mm-ss" timestamp string based on the environment clock.
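For illustration, a name in the "YY-MM-DD_hh-mm-ss" format can be produced with the standard library as sketched below. `default_project_name` is a hypothetical helper; the use of local time and a 24-hour clock is an assumption, since the page only says the string is based on the environment clock.

```python
from datetime import datetime
from typing import Optional


def default_project_name(now: Optional[datetime] = None) -> str:
    """Hypothetical helper: build a "YY-MM-DD_hh-mm-ss" string."""
    # Assumption: local time, 24-hour clock.
    if now is None:
        now = datetime.now()
    return now.strftime("%y-%m-%d_%H-%M-%S")
```

Passing an explicit `project_name` instead makes saved weights easier to locate later.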

max_iter: int, default=10,000

The total number of training iterations to use during algorithm fitting. This must be greater than the sensitivity argument.

batch_size: int | float, default=1000

The number of training examples to use for each batch during fitting. A floating-point argument must be between 0 and 1 and is interpreted as the fraction of the input data to use as a batch.
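One way to read the two forms of batch_size is sketched below. `resolve_batch_size` is a hypothetical helper, not the library's implementation: treating 0 as excluded and 1 as allowed, and rounding down to a whole number of rows, are assumptions.

```python
from typing import Union


def resolve_batch_size(batch_size: Union[int, float], n_rows: int) -> int:
    """Hypothetical helper: convert batch_size into a row count."""
    if isinstance(batch_size, float):
        # Assumption: 0 is excluded and 1 (the whole dataset) is allowed.
        if not 0 < batch_size <= 1:
            raise ValueError("a float batch_size must be between 0 and 1")
        # Assumption: round down, but always use at least one row.
        return max(1, int(batch_size * n_rows))
    # Integers are taken as an absolute number of examples.
    return batch_size
```

Under these assumptions, `batch_size=0.25` on a 1,000-row dataset corresponds to 250 rows per batch.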

Methods

fit

def fit(self, src_path: str = "./src") -> "DarkMatter"

Fits the algorithm to the input data.

generate

def generate(self, X: Iterable) -> DarkMatterType

Transforms a given dataset to embeddings using Dark Matter.

save

def save(self, *, path: str = "weights", project_name: str | None = None) -> str

Saves the algorithm weights to a subfolder "{path}/{project_name}/".

load

def load(self, *, path: str = "weights", project_name: str | None = None) -> "DarkMatter"

Loads the algorithm's weights from a subfolder "{path}/{project_name}/".

Notes

  • NaN values are not compatible with algorithm training; they must be filled before calling fit or generate.

  • Dark Matter utilizes a variety of Data Science and backend Python packages for algorithm training and source code management. Learn more at The Dark Matter Environment.
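As an illustration of the first note, NaN values can be filled before the data reaches fit or generate. The sketch below uses column-mean imputation in plain Python; the choice of strategy is an assumption, not a recommendation from this page, and `fill_nan_with_column_mean` is a hypothetical helper. With a pandas DataFrame, `DataFrame.fillna` serves the same purpose.

```python
import math
from typing import List, Sequence


def fill_nan_with_column_mean(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Hypothetical helper: replace NaN with the mean of its column."""
    if not rows:
        return []
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        # Average only the observed (non-NaN) values in column j.
        observed = [row[j] for row in rows if not math.isnan(row[j])]
        # Assumption: a column with no observed values is filled with 0.0.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if math.isnan(v) else v for j, v in enumerate(row)]
            for row in rows]
```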

Product Updates Coming Soon

  • GPU support.

  • Training progress and insights.

  • Automatic hyperparameter tuning.


Last updated 7 months ago
