Class: DictVectorizer

Transforms lists of feature-value mappings to vectors.

This transformer turns lists of mappings (dict-like objects) of feature names to feature values into NumPy arrays or scipy.sparse matrices for use with scikit-learn estimators.

When feature values are strings, this transformer will do a binary one-hot (aka one-of-K) coding: one boolean-valued feature is constructed for each of the possible string values that the feature can take on. For instance, a feature “f” that can take on the values “ham” and “spam” will become two features in the output, one signifying “f=ham”, the other “f=spam”.
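The one-hot coding described above can be sketched in plain TypeScript. This is only an illustration of the documented behaviour, not the library's implementation: the helper `oneHotStrings` and the type `StringSample` are hypothetical names, and the `"="` separator is assumed to match the `f=ham` naming shown above.

```typescript
// Hypothetical type for this sketch: every feature value is a string.
type StringSample = Record<string, string>;

// Hypothetical helper: one-hot encode string-valued features.
function oneHotStrings(
  samples: StringSample[],
  separator: string = "="
): { names: string[]; rows: number[][] } {
  // Collect one column name per distinct (feature, value) pair.
  const seen: Record<string, boolean> = {};
  for (const sample of samples) {
    for (const feature of Object.keys(sample)) {
      seen[feature + separator + sample[feature]] = true;
    }
  }
  const names = Object.keys(seen).sort();

  // Build one row per sample: 1 where the pair is present, 0 elsewhere.
  const rows = samples.map((sample) => {
    const active: Record<string, boolean> = {};
    for (const feature of Object.keys(sample)) {
      active[feature + separator + sample[feature]] = true;
    }
    return names.map((columnName) => (active[columnName] ? 1 : 0));
  });
  return { names, rows };
}

const encoded = oneHotStrings([{ f: "ham" }, { f: "spam" }, {}]);
console.log(encoded.names); // ["f=ham", "f=spam"]
console.log(encoded.rows);  // [[1, 0], [0, 1], [0, 0]]
```

The third sample has no `f` entry, so both of its columns stay at zero.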

If a feature value is a sequence or set of strings, this transformer will iterate over the values and will count the occurrences of each string value.
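As a rough illustration of the counting behaviour for sequence-valued features, the sketch below tallies how often each string occurs per sample. The helper `countOccurrences` and the type `SequenceSample` are hypothetical, and the `"="` separator is an assumption carried over from the one-hot naming.

```typescript
// Hypothetical type for this sketch: every feature value is a list of strings.
type SequenceSample = Record<string, string[]>;

// Hypothetical helper: count occurrences of each string per sample.
function countOccurrences(
  samples: SequenceSample[],
  separator: string = "="
): { names: string[]; rows: number[][] } {
  // Per-sample tallies keyed by "feature<separator>value".
  const perSample = samples.map((sample) => {
    const counts: Record<string, number> = {};
    for (const feature of Object.keys(sample)) {
      for (const value of sample[feature]) {
        const key = feature + separator + value;
        counts[key] = (counts[key] || 0) + 1;
      }
    }
    return counts;
  });

  // Union of all keys gives the output columns.
  const seen: Record<string, boolean> = {};
  for (const counts of perSample) {
    for (const key of Object.keys(counts)) {
      seen[key] = true;
    }
  }
  const names = Object.keys(seen).sort();
  const rows = perSample.map((counts) => names.map((key) => counts[key] || 0));
  return { names, rows };
}

const counted = countOccurrences([{ tags: ["a", "b", "a"] }, { tags: ["b"] }]);
console.log(counted.names); // ["tags=a", "tags=b"]
console.log(counted.rows);  // [[2, 1], [0, 1]]
```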

Note, however, that this transformer applies binary one-hot encoding only to feature values of type string. If categorical features are represented as numeric values such as int, or as iterables of strings, follow the DictVectorizer with OneHotEncoder to complete the binary one-hot encoding.
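The difference between string and numeric values can be made concrete with a small sketch. The helper `encodeEntry` is hypothetical and only mirrors the rule stated above: a string value yields its own indicator column, while a number stays in a single column under the bare feature name.

```typescript
// Hypothetical helper: map one (feature, value) entry to (column name, cell value).
function encodeEntry(
  feature: string,
  value: string | number,
  separator: string = "="
): [string, number] {
  if (typeof value === "string") {
    // String value: dedicated indicator column set to 1.
    return [feature + separator + value, 1];
  }
  // Numeric value: single column holding the number itself.
  return [feature, value];
}

console.log(encodeEntry("color", "2")); // ["color=2", 1]
console.log(encodeEntry("color", 2));   // ["color", 2]
```

A category stored as the integer 2 therefore reads as a magnitude in one column, which is why a further one-hot step is suggested for numerically coded categories.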

Features that do not occur in a sample (mapping) will have a zero value in the resulting array/matrix.

For an efficiency comparison of the different feature extractors, see FeatureHasher and DictVectorizer Comparison.

Constructors

new DictVectorizer()

new DictVectorizer(opts?): DictVectorizer

Parameters

Parameter	Type	Description
`opts`?	`object`	-
`opts.dtype`?	`any`	The type of feature values. Passed to NumPy array/scipy.sparse matrix constructors as the dtype argument.
`opts.separator`?	`string`	Separator string used when constructing new features for one-hot coding.
`opts.sort`?	`boolean`	Whether `feature_names_` and `vocabulary_` should be sorted when fitting.
`opts.sparse`?	`boolean`	Whether transform should produce scipy.sparse matrices.

Returns DictVectorizer

Defined in generated/feature_extraction/DictVectorizer.ts:35
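To show what the `separator` and `sort` options influence, here is a minimal sketch that derives feature names from samples. `buildFeatureNames` and `NameOptions` are hypothetical names, and the defaults used here (`"="` and `true`) are assumed to follow scikit-learn's documented defaults.

```typescript
// Hypothetical options type mirroring two of the constructor options.
interface NameOptions {
  separator?: string;
  sort?: boolean;
}

// Hypothetical helper: derive output feature names from the samples.
function buildFeatureNames(
  samples: Record<string, string | number>[],
  opts: NameOptions = {}
): string[] {
  const separator = opts.separator !== undefined ? opts.separator : "=";
  const sort = opts.sort !== undefined ? opts.sort : true;

  const names: string[] = [];
  for (const sample of samples) {
    for (const feature of Object.keys(sample)) {
      const value = sample[feature];
      // Strings produce "feature<separator>value"; numbers keep the bare name.
      const columnName = typeof value === "string" ? feature + separator + value : feature;
      if (names.indexOf(columnName) === -1) {
        names.push(columnName);
      }
    }
  }
  // With sort disabled, names stay in first-seen order.
  return sort ? names.slice().sort() : names;
}

const demoSamples = [{ b: "y" }, { a: "x", n: 1 }];
console.log(buildFeatureNames(demoSamples));                     // ["a=x", "b=y", "n"]
console.log(buildFeatureNames(demoSamples, { sort: false }));    // ["b=y", "a=x", "n"]
console.log(buildFeatureNames(demoSamples, { separator: ":" })); // ["a:x", "b:y", "n"]
```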

Properties

Property	Type	Default value	Defined in
`_isDisposed`	`boolean`	`false`	generated/feature_extraction/DictVectorizer.ts:33
`_isInitialized`	`boolean`	`false`	generated/feature_extraction/DictVectorizer.ts:32
`_py`	`PythonBridge`	`undefined`	generated/feature_extraction/DictVectorizer.ts:31
`id`	`string`	`undefined`	generated/feature_extraction/DictVectorizer.ts:28
`opts`	`any`	`undefined`	generated/feature_extraction/DictVectorizer.ts:29

Accessors

feature_names_

Get Signature

get feature_names_(): Promise<any[]>

A list of length n_features containing the feature names (e.g., “f=ham” and “f=spam”).

Returns Promise<any[]>

Defined in generated/feature_extraction/DictVectorizer.ts:496

py

Get Signature

get py(): PythonBridge

Returns PythonBridge

Set Signature

set py(pythonBridge): void

Parameters

Parameter	Type
`pythonBridge`	`PythonBridge`

Returns void

Defined in generated/feature_extraction/DictVectorizer.ts:66

vocabulary_

Get Signature

get vocabulary_(): Promise<any>

A dictionary mapping feature names to feature indices.

Returns Promise<any>

Defined in generated/feature_extraction/DictVectorizer.ts:471
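The two accessors describe the same ordering from opposite directions: `feature_names_` lists names by index, and `vocabulary_` maps each name back to its index. The sketch below illustrates that relationship with two hypothetical helpers, `toVocabulary` and `toFeatureNames`; it models the vocabulary as a plain object, which is an assumption about how the resolved value looks.

```typescript
// Hypothetical helper: name -> index mapping from an ordered name list.
function toVocabulary(featureNames: string[]): Record<string, number> {
  const vocabulary: Record<string, number> = {};
  featureNames.forEach((featureName, index) => {
    vocabulary[featureName] = index;
  });
  return vocabulary;
}

// Hypothetical helper: recover the ordered name list from the mapping.
function toFeatureNames(vocabulary: Record<string, number>): string[] {
  return Object.keys(vocabulary).sort((a, b) => vocabulary[a] - vocabulary[b]);
}

const vocab = toVocabulary(["f=ham", "f=spam", "n"]);
console.log(vocab["f=spam"]);       // 1
console.log(toFeatureNames(vocab)); // ["f=ham", "f=spam", "n"]
```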

Methods

dispose()

dispose(): Promise<void>

Disposes of the underlying Python resources.

Once dispose() is called, the instance is no longer usable.

Returns Promise<void>

Defined in generated/feature_extraction/DictVectorizer.ts:118

fit()

fit(opts): Promise<any>

Learn a list of feature name -> indices mappings.

Parameters

Parameter	Type	Description
`opts`	`object`	-
`opts.X`?	`any`	Dict(s) or Mapping(s) from feature names (arbitrary Python objects) to feature values (strings or convertible to dtype).
`opts.y`?	`any`	Ignored parameter.

Returns Promise<any>

Defined in generated/feature_extraction/DictVectorizer.ts:135

fit_transform()

fit_transform(opts): Promise<any>

Learn a list of feature name -> indices mappings and transform X.

Like fit(X) followed by transform(X), but does not require materializing X in memory.

Parameters

Parameter	Type	Description
`opts`	`object`	-
`opts.X`?	`any`	Dict(s) or Mapping(s) from feature names (arbitrary Python objects) to feature values (strings or convertible to dtype).
`opts.y`?	`any`	Ignored parameter.

Returns Promise<any>

Defined in generated/feature_extraction/DictVectorizer.ts:174
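The remark about not materializing X can be illustrated with a single-pass sketch that learns the vocabulary and emits sparse (row, column, value) triples while reading samples one at a time. Everything here is hypothetical (`fitTransformStream`, `FeatureSample`, `Triple`, and the pull-style `next` reader), and for brevity the sketch assigns column indices in first-seen order rather than sorting them.

```typescript
// Hypothetical types for this sketch.
type FeatureSample = Record<string, string | number>;
interface Triple {
  row: number;
  col: number;
  value: number;
}

// Hypothetical helper: learn the vocabulary and transform in one pass.
// `next` returns the next sample, or undefined when the stream is exhausted.
function fitTransformStream(
  next: () => FeatureSample | undefined,
  separator: string = "="
): { vocabulary: Record<string, number>; triples: Triple[]; nRows: number } {
  const vocabulary: Record<string, number> = {};
  const triples: Triple[] = [];
  let nFeatures = 0;
  let row = 0;

  for (let sample = next(); sample !== undefined; sample = next()) {
    for (const feature of Object.keys(sample)) {
      const raw = sample[feature];
      const columnName = typeof raw === "string" ? feature + separator + raw : feature;
      // Assign a new column index the first time a name is seen.
      if (!Object.prototype.hasOwnProperty.call(vocabulary, columnName)) {
        vocabulary[columnName] = nFeatures++;
      }
      triples.push({ row, col: vocabulary[columnName], value: typeof raw === "string" ? 1 : raw });
    }
    row++;
  }
  return { vocabulary, triples, nRows: row };
}

// Usage: a queue stands in for any source that yields samples lazily.
const queue: FeatureSample[] = [{ f: "ham" }, { f: "spam", n: 2 }];
const streamed = fitTransformStream(() => queue.shift());
console.log(streamed.vocabulary); // { "f=ham": 0, "f=spam": 1, n: 2 }
console.log(streamed.nRows);      // 2
```

Only the triples and the vocabulary are held in memory; each sample can be discarded as soon as it has been read.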

get_feature_names_out()

get_feature_names_out(opts): Promise<any>

Get output feature names for transformation.

Parameters

Parameter	Type	Description
`opts`	`object`	-
`opts.input_features`?	`any`	Not used, present here for API consistency by convention.

Returns Promise<any>

Defined in generated/feature_extraction/DictVectorizer.ts:211

get_metadata_routing()

get_metadata_routing(opts): Promise<any>

Get metadata routing of this object.

See the User Guide for how the routing mechanism works.

Parameters

Parameter	Type	Description
`opts`	`object`	-
`opts.routing`?	`any`	A `MetadataRequest` encapsulating routing information.

Returns Promise<any>

Defined in generated/feature_extraction/DictVectorizer.ts:247

init()

init(py): Promise<void>

Initializes the underlying Python resources.

This instance is not usable until the Promise returned by init() resolves.

Parameters

Parameter	Type
`py`	`PythonBridge`

Returns Promise<void>

Defined in generated/feature_extraction/DictVectorizer.ts:79

inverse_transform()

inverse_transform(opts): Promise<any[]>

Transform array or sparse matrix X back to feature mappings.

X must have been produced by this DictVectorizer’s transform or fit_transform method; it may only have passed through transformers that preserve the number of features and their order.

In the case of one-hot/one-of-K coding, the constructed feature names and values are returned rather than the original ones.

Parameters

Parameter	Type	Description
`opts`	`object`	-
`opts.dict_type`?	`any`	Constructor for feature mappings. Must conform to the collections.abc.Mapping API.
`opts.X`?	`ArrayLike`	Sample matrix.

Returns Promise<any[]>

Defined in generated/feature_extraction/DictVectorizer.ts:285
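A minimal sketch of the inverse mapping for a dense matrix follows. The helper `inverseTransform` is hypothetical and assumes zero entries are omitted from the result; it shows why one-hot columns come back under their constructed names (such as `f=ham`) rather than as the original `f: "ham"` entry.

```typescript
// Hypothetical helper: map each dense row back to a name -> value mapping.
function inverseTransform(
  rows: number[][],
  featureNames: string[]
): Record<string, number>[] {
  return rows.map((row) => {
    const mapping: Record<string, number> = {};
    row.forEach((value, col) => {
      // Only non-zero cells are reported.
      if (value !== 0) {
        mapping[featureNames[col]] = value;
      }
    });
    return mapping;
  });
}

const restored = inverseTransform(
  [[1, 0, 3], [0, 1, 0]],
  ["f=ham", "f=spam", "n"]
);
console.log(restored); // [{ "f=ham": 1, n: 3 }, { "f=spam": 1 }]
```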

restrict()

restrict(opts): Promise<any>

Restrict the features to those in support using feature selection.

This function modifies the estimator in-place.

Parameters

Parameter	Type	Description
`opts`	`object`	-
`opts.indices`?	`boolean`	Whether support is a list of indices.
`opts.support`?	`ArrayLike`	Boolean mask or list of indices (as returned by the get_support member of feature selectors).

Returns Promise<any>

Defined in generated/feature_extraction/DictVectorizer.ts:326
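The effect of a boolean mask versus a list of indices can be sketched as below. `restrictFeatures` is a hypothetical helper; unlike the real method, which modifies the estimator in place, it returns the restricted names and vocabulary as new values.

```typescript
// Hypothetical helper: keep only the supported features and renumber them.
function restrictFeatures(
  featureNames: string[],
  support: (boolean | number)[],
  indices: boolean = false
): { featureNames: string[]; vocabulary: Record<string, number> } {
  // Normalise the support argument to a list of selected positions.
  const selected: number[] = [];
  support.forEach((entry, position) => {
    if (indices) {
      selected.push(entry as number); // entries are already positions
    } else if (entry) {
      selected.push(position);        // entries are mask flags
    }
  });

  const kept = selected.map((position) => featureNames[position]);
  const vocabulary: Record<string, number> = {};
  kept.forEach((featureName, newIndex) => {
    vocabulary[featureName] = newIndex;
  });
  return { featureNames: kept, vocabulary };
}

const byMask = restrictFeatures(["a", "b", "c"], [true, false, true]);
console.log(byMask.featureNames); // ["a", "c"]

const byIndex = restrictFeatures(["a", "b", "c"], [2, 0], true);
console.log(byIndex.featureNames); // ["c", "a"]
```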

set_inverse_transform_request()

set_inverse_transform_request(opts): Promise<any>

Request metadata passed to the inverse_transform method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). See the User Guide for how the routing mechanism works.

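The options for each parameter are: true (the metadata is requested and passed to inverse_transform if provided), false (the metadata is not requested and is not passed to inverse_transform), or a string (the metadata is passed under the given alias instead of its original name).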

Parameters

Parameter	Type	Description
`opts`	`object`	-
`opts.dict_type`?	`string` \| `boolean`	Metadata routing for `dict_type` parameter in `inverse_transform`.

Returns Promise<any>

Defined in generated/feature_extraction/DictVectorizer.ts:369

set_output()

set_output(opts): Promise<any>

Set output container.

See Introducing the set_output API for an example of how to use the API.

Parameters

Parameter	Type	Description
`opts`	`object`	-
`opts.transform`?	`"default"` \| `"pandas"` \| `"polars"`	Configure output of `transform` and `fit_transform`.

Returns Promise<any>

Defined in generated/feature_extraction/DictVectorizer.ts:405

transform()

transform(opts): Promise<any>

Transform feature->value dicts to array or sparse matrix.

Named features not encountered during fit or fit_transform will be silently ignored.

Parameters

Parameter	Type	Description
`opts`	`object`	-
`opts.X`?	`any`[]	Dict(s) or Mapping(s) from feature names (arbitrary Python objects) to feature values (strings or convertible to dtype).

Returns Promise<any>

Defined in generated/feature_extraction/DictVectorizer.ts:439
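The note about unseen features can be illustrated with a sketch that transforms samples against an already learned vocabulary. `transformWithVocabulary` and `InputSample` are hypothetical names, and the `"="` separator is an assumption; names missing from the vocabulary are skipped without raising an error.

```typescript
// Hypothetical type for this sketch.
type InputSample = Record<string, string | number>;

// Hypothetical helper: build dense rows using a fixed, previously learned vocabulary.
function transformWithVocabulary(
  samples: InputSample[],
  vocabulary: Record<string, number>,
  separator: string = "="
): number[][] {
  const nFeatures = Object.keys(vocabulary).length;
  return samples.map((sample) => {
    // Start from an all-zero row.
    const row: number[] = [];
    for (let i = 0; i < nFeatures; i++) {
      row.push(0);
    }
    for (const feature of Object.keys(sample)) {
      const raw = sample[feature];
      const columnName = typeof raw === "string" ? feature + separator + raw : feature;
      // Names not seen during fitting are silently ignored.
      if (Object.prototype.hasOwnProperty.call(vocabulary, columnName)) {
        row[vocabulary[columnName]] = typeof raw === "string" ? 1 : raw;
      }
    }
    return row;
  });
}

const learned = { "f=ham": 0, "f=spam": 1 };
const transformed = transformWithVocabulary(
  [{ f: "ham" }, { f: "eggs" }, { g: "x", f: "spam" }],
  learned
);
console.log(transformed); // [[1, 0], [0, 0], [0, 1]]
```

The value `f=eggs` and the feature `g` were never fitted, so they contribute nothing to the output.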

Last updated on November 21, 2024
