
Class: DictVectorizer

Transforms lists of feature-value mappings to vectors.

This transformer turns lists of mappings (dict-like objects) of feature names to feature values into NumPy arrays or scipy.sparse matrices for use with scikit-learn estimators.

When feature values are strings, this transformer will do a binary one-hot (aka one-of-K) coding: one boolean-valued feature is constructed for each of the possible string values that the feature can take on. For instance, a feature “f” that can take on the values “ham” and “spam” will become two features in the output, one signifying “f=ham”, the other “f=spam”.
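The one-hot behaviour described above can be sketched in plain TypeScript. This is an illustration under stated assumptions, not the library's implementation: the names `Sample`, `featureName`, `learnFeatureNames`, and `encode` are hypothetical, the output is a dense array rather than a scipy.sparse matrix, and the separator is assumed to be "=".

```typescript
// Illustrative sketch only; all names here are hypothetical helpers.
type Sample = Record<string, string | number>;

// String values become "key=value"; numeric values keep the bare key.
function featureName(key: string, value: string | number, separator = "="): string {
  return typeof value === "string" ? `${key}${separator}${value}` : key;
}

// Collect every feature name seen in the samples, sorted lexicographically.
function learnFeatureNames(samples: Sample[]): string[] {
  const seen: Record<string, true> = {};
  for (const sample of samples) {
    for (const key of Object.keys(sample)) {
      seen[featureName(key, sample[key])] = true;
    }
  }
  return Object.keys(seen).sort();
}

// Encode samples as dense rows: string values contribute 1, numbers their value,
// and features absent from a sample stay 0.
function encode(samples: Sample[], featureNames: string[]): number[][] {
  const columnOf: Record<string, number> = {};
  featureNames.forEach((feature, i) => { columnOf[feature] = i; });
  return samples.map((sample) => {
    const row: number[] = featureNames.map(() => 0);
    for (const key of Object.keys(sample)) {
      const value = sample[key];
      const feature = featureName(key, value);
      if (Object.prototype.hasOwnProperty.call(columnOf, feature)) {
        row[columnOf[feature]] = typeof value === "string" ? 1 : value;
      }
    }
    return row;
  });
}

const spamSamples: Sample[] = [{ f: "ham", len: 3 }, { f: "spam" }];
const spamFeatureNames = learnFeatureNames(spamSamples); // ["f=ham", "f=spam", "len"]
const spamRows = encode(spamSamples, spamFeatureNames);  // [[1, 0, 3], [0, 1, 0]]
```

The second sample has no "len" entry, so its "len" column is 0; the numeric feature "len" is passed through unchanged rather than one-hot encoded.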

If a feature value is a sequence or set of strings, this transformer will iterate over the values and will count the occurrences of each string value.
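As a rough sketch of the counting behaviour for sequences of strings, the hypothetical helper below (`countStringValues` is not part of the library) builds one "key=value" feature per distinct string and counts how often each occurs. The "=" separator is an assumption.

```typescript
// Illustrative sketch only; countStringValues is a hypothetical helper.
type MultiValueSample = Record<string, string[]>;

// For each key, count how many times each string value occurs in its sequence.
function countStringValues(sample: MultiValueSample, separator = "="): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const key of Object.keys(sample)) {
    for (const value of sample[key]) {
      const feature = key + separator + value;
      counts[feature] = (counts[feature] || 0) + 1;
    }
  }
  return counts;
}

const tagCounts = countStringValues({ tags: ["a", "b", "a"] });
// tagCounts is { "tags=a": 2, "tags=b": 1 }
```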

Note, however, that binary one-hot encoding is applied only when feature values are strings. If categorical features are represented as numeric values such as int, or as iterables of strings, follow DictVectorizer with OneHotEncoder to complete the binary one-hot encoding.

Features that do not occur in a sample (mapping) will have a zero value in the resulting array/matrix.

For an efficiency comparison of the different feature extractors, see FeatureHasher and DictVectorizer Comparison.

Read more in the User Guide.

Python Reference

Constructors

new DictVectorizer()

new DictVectorizer(opts?): DictVectorizer

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| opts? | object | - |
| opts.dtype? | any | The type of feature values. Passed to NumPy array/scipy.sparse matrix constructors as the dtype argument. |
| opts.separator? | string | Separator string used when constructing new features for one-hot coding. |
| opts.sort? | boolean | Whether feature_names_ and vocabulary_ should be sorted when fitting. |
| opts.sparse? | boolean | Whether transform should produce scipy.sparse matrices. |

Returns DictVectorizer

Defined in generated/feature_extraction/DictVectorizer.ts:35

Properties

| Property | Type | Default value | Defined in |
| --- | --- | --- | --- |
| _isDisposed | boolean | false | generated/feature_extraction/DictVectorizer.ts:33 |
| _isInitialized | boolean | false | generated/feature_extraction/DictVectorizer.ts:32 |
| _py | PythonBridge | undefined | generated/feature_extraction/DictVectorizer.ts:31 |
| id | string | undefined | generated/feature_extraction/DictVectorizer.ts:28 |
| opts | any | undefined | generated/feature_extraction/DictVectorizer.ts:29 |

Accessors

feature_names_

Get Signature

get feature_names_(): Promise<any[]>

A list of length n_features containing the feature names (e.g., “f=ham” and “f=spam”).

Returns Promise<any[]>

Defined in generated/feature_extraction/DictVectorizer.ts:496


py

Get Signature

get py(): PythonBridge

Returns PythonBridge

Set Signature

set py(pythonBridge): void

Parameters

| Parameter | Type |
| --- | --- |
| pythonBridge | PythonBridge |

Returns void

Defined in generated/feature_extraction/DictVectorizer.ts:66


vocabulary_

Get Signature

get vocabulary_(): Promise<any>

A dictionary mapping feature names to feature indices.

Returns Promise<any>

Defined in generated/feature_extraction/DictVectorizer.ts:471
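The relationship between feature_names_ and vocabulary_ can be pictured as a list and its inverse lookup. The sketch below uses a hypothetical helper, `buildVocabulary`, and plain objects in place of the values the accessors resolve to; it is an illustration, not the library's code.

```typescript
// Illustrative sketch only; buildVocabulary is a hypothetical helper.
// Given the list of feature names, the vocabulary maps each name to its index.
function buildVocabulary(featureNames: string[]): Record<string, number> {
  const vocabulary: Record<string, number> = {};
  featureNames.forEach((feature, index) => { vocabulary[feature] = index; });
  return vocabulary;
}

const exampleFeatureNames = ["f=ham", "f=spam"];
const exampleVocabulary = buildVocabulary(exampleFeatureNames);
// Looking a name up in the vocabulary and indexing back into the list
// returns the same name.
const roundTripped = exampleFeatureNames[exampleVocabulary["f=spam"]];
```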

Methods

dispose()

dispose(): Promise<void>

Disposes of the underlying Python resources.

Once dispose() is called, the instance is no longer usable.

Returns Promise<void>

Defined in generated/feature_extraction/DictVectorizer.ts:118


fit()

fit(opts): Promise<any>

Learn a list of feature name -> indices mappings.

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| opts | object | - |
| opts.X? | any | Dict(s) or Mapping(s) from feature names (arbitrary Python objects) to feature values (strings or convertible to dtype). |
| opts.y? | any | Ignored parameter. |

Returns Promise<any>

Defined in generated/feature_extraction/DictVectorizer.ts:135


fit_transform()

fit_transform(opts): Promise<any>

Learn a list of feature name -> indices mappings and transform X.

Like fit(X) followed by transform(X), but does not require materializing X in memory.

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| opts | object | - |
| opts.X? | any | Dict(s) or Mapping(s) from feature names (arbitrary Python objects) to feature values (strings or convertible to dtype). |
| opts.y? | any | Ignored parameter. |

Returns Promise<any>

Defined in generated/feature_extraction/DictVectorizer.ts:174


get_feature_names_out()

get_feature_names_out(opts): Promise<any>

Get output feature names for transformation.

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| opts | object | - |
| opts.input_features? | any | Not used, present here for API consistency by convention. |

Returns Promise<any>

Defined in generated/feature_extraction/DictVectorizer.ts:211


get_metadata_routing()

get_metadata_routing(opts): Promise<any>

Get metadata routing of this object.

See the User Guide for how the routing mechanism works.

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| opts | object | - |
| opts.routing? | any | A MetadataRequest encapsulating routing information. |

Returns Promise<any>

Defined in generated/feature_extraction/DictVectorizer.ts:247


init()

init(py): Promise<void>

Initializes the underlying Python resources.

This instance is not usable until the Promise returned by init() resolves.

Parameters

| Parameter | Type |
| --- | --- |
| py | PythonBridge |

Returns Promise<void>

Defined in generated/feature_extraction/DictVectorizer.ts:79
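The init() and dispose() contract, together with the _isInitialized and _isDisposed properties, suggests a simple state guard. The class below is a hypothetical stand-in that models only that state machine: it is synchronous for brevity (the real methods return Promises), it talks to no Python process, and the error messages are invented for illustration.

```typescript
// Hypothetical stand-in for a bridge-backed wrapper; not the library's code.
class LifecycleSketch {
  private isInitialized = false;
  private isDisposed = false;

  // Marks the instance as ready; a disposed instance cannot be revived.
  init(): void {
    if (this.isDisposed) {
      throw new Error("instance has already been disposed");
    }
    this.isInitialized = true;
  }

  // Releases resources; the instance is unusable afterwards.
  dispose(): void {
    this.isInitialized = false;
    this.isDisposed = true;
  }

  // Guard to call at the top of every method that needs the resources.
  assertUsable(): void {
    if (this.isDisposed) {
      throw new Error("instance has already been disposed");
    }
    if (!this.isInitialized) {
      throw new Error("call init() before using this instance");
    }
  }
}

// Helper that reports whether the guard passes, without throwing.
function isUsable(instance: LifecycleSketch): boolean {
  try {
    instance.assertUsable();
    return true;
  } catch (error) {
    return false;
  }
}

const lifecycle = new LifecycleSketch();
const usableBeforeInit = isUsable(lifecycle);   // false
lifecycle.init();
const usableAfterInit = isUsable(lifecycle);    // true
lifecycle.dispose();
const usableAfterDispose = isUsable(lifecycle); // false
```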


inverse_transform()

inverse_transform(opts): Promise<any[]>

Transform array or sparse matrix X back to feature mappings.

X must have been produced by this DictVectorizer’s transform or fit_transform method; it may only have passed through transformers that preserve the number of features and their order.

In the case of one-hot/one-of-K coding, the constructed feature names and values are returned rather than the original ones.

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| opts | object | - |
| opts.dict_type? | any | Constructor for feature mappings. Must conform to the collections.abc.Mapping API. |
| opts.X? | ArrayLike | Sample matrix. |

Returns Promise<any[]>

Defined in generated/feature_extraction/DictVectorizer.ts:285
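To make the note about constructed feature names concrete, here is a minimal sketch of the inverse mapping on dense rows. `inverseTransformSketch` is a hypothetical helper, not the library method; it assumes that only non-zero entries are emitted and that the feature names are the constructed ones (for example "f=ham"), so the original {f: "ham"} is not recovered.

```typescript
// Illustrative sketch only; inverseTransformSketch is a hypothetical helper.
// Each row becomes a mapping from constructed feature name to its non-zero value.
function inverseTransformSketch(
  rows: number[][],
  featureNames: string[]
): Record<string, number>[] {
  return rows.map((row) => {
    const mapping: Record<string, number> = {};
    row.forEach((value, column) => {
      if (value !== 0) {
        mapping[featureNames[column]] = value;
      }
    });
    return mapping;
  });
}

const restoredMappings = inverseTransformSketch(
  [[1, 0, 3], [0, 1, 0]],
  ["f=ham", "f=spam", "len"]
);
// restoredMappings is [{ "f=ham": 1, "len": 3 }, { "f=spam": 1 }]
```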


restrict()

restrict(opts): Promise<any>

Restrict the features to those in support using feature selection.

This function modifies the estimator in-place.

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| opts | object | - |
| opts.indices? | boolean | Whether support is a list of indices. |
| opts.support? | ArrayLike | Boolean mask or list of indices (as returned by the get_support member of feature selectors). |

Returns Promise<any>

Defined in generated/feature_extraction/DictVectorizer.ts:326
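The effect of restricting features can be sketched as follows. The helpers `restrictByIndices` and `restrictByMask` are hypothetical and return new values instead of modifying an estimator in place; they assume the surviving features are renumbered consecutively from 0 in the order given by the support.

```typescript
// Illustrative sketch only; both helpers are hypothetical.
interface RestrictedFeatures {
  featureNames: string[];
  vocabulary: Record<string, number>;
}

// Keep only the features at the given indices and renumber them from 0.
function restrictByIndices(featureNames: string[], keep: number[]): RestrictedFeatures {
  const kept = keep.map((index) => featureNames[index]);
  const vocabulary: Record<string, number> = {};
  kept.forEach((feature, newIndex) => { vocabulary[feature] = newIndex; });
  return { featureNames: kept, vocabulary };
}

// Convert a boolean mask to indices, then restrict as above.
function restrictByMask(featureNames: string[], mask: boolean[]): RestrictedFeatures {
  const keep: number[] = [];
  mask.forEach((selected, index) => {
    if (selected) keep.push(index);
  });
  return restrictByIndices(featureNames, keep);
}

const allFeatureNames = ["f=ham", "f=spam", "len"];
const restrictedByMask = restrictByMask(allFeatureNames, [true, false, true]);
// restrictedByMask.featureNames is ["f=ham", "len"]
// restrictedByMask.vocabulary is { "f=ham": 0, "len": 1 }
const restrictedByIndices = restrictByIndices(allFeatureNames, [0, 2]);
```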


set_inverse_transform_request()

set_inverse_transform_request(opts): Promise<any>

Request metadata passed to the inverse_transform method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). See the User Guide for how the routing mechanism works.

Each parameter accepts a boolean or a string, as listed below.

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| opts | object | - |
| opts.dict_type? | string \| boolean | Metadata routing for dict_type parameter in inverse_transform. |

Returns Promise<any>

Defined in generated/feature_extraction/DictVectorizer.ts:369


set_output()

set_output(opts): Promise<any>

Set output container.

See Introducing the set_output API for an example of how to use the API.

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| opts | object | - |
| opts.transform? | "default" \| "pandas" \| "polars" | Configure output of transform and fit_transform. |

Returns Promise<any>

Defined in generated/feature_extraction/DictVectorizer.ts:405


transform()

transform(opts): Promise<any>

Transform feature->value dicts to array or sparse matrix.

Named features not encountered during fit or fit_transform will be silently ignored.

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| opts | object | - |
| opts.X? | any[] | Dict(s) or Mapping(s) from feature names (arbitrary Python objects) to feature values (strings or convertible to dtype). |

Returns Promise<any>

Defined in generated/feature_extraction/DictVectorizer.ts:439
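The "silently ignored" rule for unseen features can be illustrated with a small sketch. `transformSketch` is a hypothetical helper that handles numeric values only and produces dense rows; the vocabulary is assumed to have been learned during an earlier fit.

```typescript
// Illustrative sketch only; transformSketch is a hypothetical helper.
// Features missing from the vocabulary are skipped without raising an error.
function transformSketch(
  samples: Record<string, number>[],
  vocabulary: Record<string, number>
): number[][] {
  const featureCount = Object.keys(vocabulary).length;
  return samples.map((sample) => {
    const row: number[] = [];
    for (let i = 0; i < featureCount; i++) row.push(0);
    for (const key of Object.keys(sample)) {
      if (Object.prototype.hasOwnProperty.call(vocabulary, key)) {
        row[vocabulary[key]] = sample[key];
      }
      // Otherwise the feature was never seen during fitting and is dropped.
    }
    return row;
  });
}

const fittedVocabulary: Record<string, number> = { a: 0, b: 1 };
const transformedRows = transformSketch([{ a: 1, c: 5 }, { b: 2 }], fittedVocabulary);
// transformedRows is [[1, 0], [0, 2]]; the unseen feature "c" leaves no trace.
```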