pyspark.pandas.to_numeric

pyspark.pandas.to_numeric(arg, errors='raise')

Convert argument to a numeric type.
Parameters
arg : scalar, list, tuple, 1-d array, or Series
    Argument to be converted.
errors : {'ignore', 'raise', 'coerce'}, default 'raise'
    If 'coerce', then invalid parsing will be set as NaN.
    If 'raise', then invalid parsing will raise an exception (shown in the examples below).
    If 'ignore', then invalid parsing will return the input.

    Note: 'ignore' doesn't work yet when arg is a pandas-on-Spark Series.
Returns
ret : numeric if parsing succeeded.
See also
DataFrame.astype : Cast argument to a specified dtype.
to_datetime : Convert argument to datetime.
to_timedelta : Convert argument to timedelta.
numpy.ndarray.astype : Cast a numpy array to a specified type.
Examples
>>> psser = ps.Series(['1.0', '2', '-3'])
>>> psser
0    1.0
1      2
2     -3
dtype: object

>>> ps.to_numeric(psser)
0    1.0
1    2.0
2   -3.0
dtype: float32
If the given Series contains values that cannot be cast to float, those values are set to np.nan when errors is set to 'coerce'.
>>> psser = ps.Series(['apple', '1.0', '2', '-3'])
>>> psser
0    apple
1      1.0
2        2
3       -3
dtype: object

>>> ps.to_numeric(psser, errors="coerce")
0    NaN
1    1.0
2    2.0
3   -3.0
dtype: float32
Lists, tuples, np.arrays, and scalars are also supported:
>>> ps.to_numeric(['1.0', '2', '-3'])
array([ 1.,  2., -3.])

>>> ps.to_numeric(('1.0', '2', '-3'))
array([ 1.,  2., -3.])

>>> ps.to_numeric(np.array(['1.0', '2', '-3']))
array([ 1.,  2., -3.])

>>> ps.to_numeric('1.0')
1.0
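The examples above only demonstrate 'coerce'; with the default errors='raise', unparsable input raises an exception instead. A minimal sketch, noting that the exact exception message may vary across versions:

>>> ps.to_numeric('apple')
Traceback (most recent call last):
    ...
ValueError: Unable to parse string "apple"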
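As a usage sketch, to_numeric can clean up a string column of a pandas-on-Spark DataFrame; the column name col here is illustrative, and the printed output is the expected result rather than captured output:

>>> psdf = ps.DataFrame({'col': ['apple', '1.0', '2', '-3']})
>>> psdf['col'] = ps.to_numeric(psdf['col'], errors='coerce')
>>> psdf
   col
0  NaN
1  1.0
2  2.0
3 -3.0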