### Unit 2 - Operations on a Series

CBSE Revision Notes
Class-11 Informatics Practices (New Syllabus)
Unit 2: Data Handling (DH-1)

Operations on a Series

A Series is a one-dimensional labeled array capable of holding data of any type (integer, string, float, Python objects, etc.). The axis labels are collectively called the index.

pandas.Series

A pandas Series can be created using the following constructor −

`pandas.Series(data, index, dtype, copy)`

The parameters of the constructor are as follows −

| S.No | Parameter & Description |
|---|---|
| 1 | data - data takes various forms like ndarray, list, dict, or constants |
| 2 | index - Index values must be hashable and the same length as data (they need not be unique). Defaults to 0, 1, ..., n-1, like np.arange(n), if no index is passed. |
| 3 | dtype - dtype is for the data type. If None, the data type will be inferred. |
| 4 | copy - Copy data. Default False |
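To make the four parameters concrete, here is a minimal sketch that passes each one by keyword. The values and the labels `'x'`, `'y'`, `'z'` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative input array (made-up values)
values = np.array([10.0, 20.0, 30.0])

# data: the values, index: one label per value,
# dtype: the element type, copy: ask pandas to copy the input array
s = pd.Series(data=values, index=['x', 'y', 'z'], dtype='float64', copy=True)

# Changing the source array afterwards does not change the Series,
# because the Series holds its own copy of the data
values[0] = 99.0

print(s)
```

With `copy=True` the Series keeps its own data, so later changes to `values` are not visible through `s`.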

A series can be created using various inputs like −

• Array
• Dict
• Scalar value or constant

Create an Empty Series

The most basic Series that can be created is an empty Series.

Example

`# import the pandas library, aliased as pd
import pandas as pd
s = pd.Series(dtype='float64')
print(s)`

Its output is as follows −

`Series([], dtype: float64)`

Create a Series from ndarray

If data is an ndarray, then the index passed must be of the same length. If no index is passed, the default index is range(n), where n is the array length, i.e., [0, 1, 2, ..., len(array)-1].

Example 1

`# import the pandas library, aliased as pd
import pandas as pd
import numpy as np
data = np.array(['a','b','c','d'])
s = pd.Series(data)
print(s)`

Its output is as follows −

`0    a
1    b
2    c
3    d
dtype: object`

We did not pass any index, so by default, it assigned the indexes ranging from 0 to len(data)-1, i.e., 0 to 3.

Example 2

`# import the pandas library, aliased as pd
import pandas as pd
import numpy as np
data = np.array(['a','b','c','d'])
s = pd.Series(data, index=[100,101,102,103])
print(s)`

Its output is as follows −

`100    a
101    b
102    c
103    d
dtype: object`

We passed the index values here. Now we can see the customized indexed values in the output.
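The difference between the default and the custom index can also be checked directly. The following is a small sketch, reusing the same array as the two examples; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

data = np.array(['a', 'b', 'c', 'd'])

# No index passed: labels default to 0 .. len(data)-1
default_s = pd.Series(data)

# Index passed: labels are exactly the ones we supplied
custom_s = pd.Series(data, index=[100, 101, 102, 103])

default_labels = list(default_s.index)  # [0, 1, 2, 3]
custom_labels = list(custom_s.index)    # [100, 101, 102, 103]

# .loc looks a value up by its label, not by its position
second_item = custom_s.loc[101]         # 'b'

print(default_labels, custom_labels, second_item)
```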

Create a Series from dict

A dict can be passed as input. If no index is specified, the dictionary keys are used as the index, in the dict's insertion order (pandas versions before 0.23 sorted the keys instead). If an index is passed, the values in data corresponding to the labels in the index are pulled out.

Example 1

`# import the pandas library, aliased as pd
import pandas as pd
import numpy as np
data = {'a' : 0., 'b' : 1., 'c' : 2.}
s = pd.Series(data)
print(s)`

Its output is as follows −

`a    0.0
b    1.0
c    2.0
dtype: float64`

Observe − Dictionary keys are used to construct index.

Example 2

`# import the pandas library, aliased as pd
import pandas as pd
import numpy as np
data = {'a' : 0., 'b' : 1., 'c' : 2.}
s = pd.Series(data, index=['b','c','d','a'])
print(s)`

Its output is as follows −

`b    1.0
c    2.0
d    NaN
a    0.0
dtype: float64`

Observe − Index order is persisted and the missing element is filled with NaN (Not a Number).
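As a follow-up, the labels that received NaN can be found programmatically. This is a minimal sketch using the same dict and index as Example 2; `isna()` and `dropna()` are standard Series methods, while the variable names are illustrative.

```python
import pandas as pd

data = {'a': 0., 'b': 1., 'c': 2.}

# 'd' is not a key in data, so its value becomes NaN
s = pd.Series(data, index=['b', 'c', 'd', 'a'])

# Boolean mask that is True where a value is missing
missing_mask = s.isna()

# Labels that had no matching key in the dict
missing_labels = s[missing_mask].index.tolist()  # ['d']

# Series with the missing entries removed
complete = s.dropna()

print(missing_labels)
print(complete)
```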

Create a Series from Scalar

If data is a scalar value, an index must be provided. The value will be repeated to match the length of the index.

`# import the pandas library, aliased as pd
import pandas as pd
import numpy as np
s = pd.Series(5, index=[0, 1, 2, 3])
print(s)`

Its output is as follows −

`0    5
1    5
2    5
3    5
dtype: int64`

pandas.Series.head

`Series.head(n=5)`

Return the first n rows.

This function returns the first n rows for the object based on position. It is useful for quickly testing if your object has the right type of data in it.

Parameters: n : int, default 5. Number of rows to select.

Returns: obj_head : type of caller. The first n rows of the caller object.

See also: `Series.tail`, which returns the last n rows.

Examples

`>>> df = pd.DataFrame({'animal': ['alligator', 'bee', 'falcon', 'lion',
...                               'monkey', 'parrot', 'shark', 'whale', 'zebra']})
>>> df
      animal
0  alligator
1        bee
2     falcon
3       lion
4     monkey
5     parrot
6      shark
7      whale
8      zebra`

Viewing the first 5 lines

`>>> df.head()
      animal
0  alligator
1        bee
2     falcon
3       lion
4     monkey`

Viewing the first n lines (three in this case)

`>>> df.head(3)
      animal
0  alligator
1        bee
2     falcon`
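The examples above call `head()` on a DataFrame. The method behaves the same way on a Series. Here is a minimal sketch that reuses the animal names; the variable names are illustrative.

```python
import pandas as pd

animals = pd.Series(['alligator', 'bee', 'falcon', 'lion', 'monkey',
                     'parrot', 'shark', 'whale', 'zebra'])

# Default: the first 5 elements
first_five = animals.head()

# Custom n: the first 3 elements
first_three = animals.head(3)

print(first_five)
print(first_three)
```

The result is itself a Series, and it keeps the original index labels (0 to 4 and 0 to 2 here).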

pandas.Series.tail

`Series.tail(n=5)`

Return the last n rows.

This function returns last n rows from the object based on position. It is useful for quickly verifying data, for example, after sorting or appending rows.

Parameters: n : int, default 5. Number of rows to select.

Returns: type of caller. The last n rows of the caller object.

See also: `Series.head`, which returns the first n rows.

Examples

`>>> df = pd.DataFrame({'animal': ['alligator', 'bee', 'falcon', 'lion',
...                               'monkey', 'parrot', 'shark', 'whale', 'zebra']})
>>> df
      animal
0  alligator
1        bee
2     falcon
3       lion
4     monkey
5     parrot
6      shark
7      whale
8      zebra`

Viewing the last 5 lines

`>>> df.tail()
   animal
4  monkey
5  parrot
6   shark
7   whale
8   zebra`

Viewing the last n lines (three in this case)

`>>> df.tail(3)
  animal
6  shark
7  whale
8  zebra`
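As with `head()`, `tail()` can be called on a Series as well as on a DataFrame. A minimal sketch with the same animal names follows; the variable names are illustrative.

```python
import pandas as pd

animals = pd.Series(['alligator', 'bee', 'falcon', 'lion', 'monkey',
                     'parrot', 'shark', 'whale', 'zebra'])

# Default: the last 5 elements
last_five = animals.tail()

# Custom n: the last 3 elements
last_three = animals.tail(3)

print(last_five)
print(last_three)
```

The original index labels are kept, so `last_three` is labeled 6, 7, 8 rather than 0, 1, 2.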

Here we discuss a lot of the essential functionality common to the pandas data structures. Here’s how to create some of the objects used in the examples from the previous section:

`In [1]: index = pd.date_range('1/1/2000', periods=8)
In [2]: s = pd.Series(np.random.randn(5), index=['a', 'b', 'c', 'd', 'e'])
In [3]: df = pd.DataFrame(np.random.randn(8, 3), index=index,
   ...:                   columns=['A', 'B', 'C'])
In [4]: wp = pd.Panel(np.random.randn(2, 5, 4), items=['Item1', 'Item2'],
   ...:               major_axis=pd.date_range('1/1/2000', periods=5),
   ...:               minor_axis=['A', 'B', 'C', 'D'])`

To view a small sample of a Series or DataFrame object, use the `head()` and `tail()` methods. The default number of elements to display is five, but you may pass a custom number.

`In [5]: long_series = pd.Series(np.random.randn(1000))
In [6]: long_series.head()
Out[6]:
0    0.229453
1    0.304418
2    0.736135
3   -0.859631
4   -0.424100
dtype: float64
In [7]: long_series.tail(3)
Out[7]:
997   -0.351587
998    1.136249
999   -0.448789
dtype: float64`

Attributes and the raw ndarray(s)

pandas objects have a number of attributes enabling you to access the metadata:

• shape: gives the axis dimensions of the object, consistent with ndarray
• Axis labels
  • Series: index (only axis)
  • DataFrame: index (rows) and columns
  • Panel: items, major_axis, and minor_axis (Panel exists only in pandas versions before 0.25, where it was removed)

Note, these attributes can be safely assigned to!

`In [8]: df[:2]
Out[8]:
                   A         B        C
2000-01-01  0.048869 -1.360687 -0.47901
2000-01-02 -0.859661 -0.231595 -0.52775
In [9]: df.columns = [x.lower() for x in df.columns]
In [10]: df
Out[10]:
                   a         b         c
2000-01-01  0.048869 -1.360687 -0.479010
2000-01-02 -0.859661 -0.231595 -0.527750
2000-01-03 -1.296337  0.150680  0.123836
2000-01-04  0.571764  1.555563 -0.823761
2000-01-05  0.535420 -1.032853  1.469725
2000-01-06  1.304124  1.449735  0.203109
2000-01-07 -1.032011  0.969818 -0.962723
2000-01-08  1.382083 -0.938794  0.669142`
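The same attributes can be read and assigned on a Series, which has only one axis. This is a minimal sketch; the values and labels are made up for illustration.

```python
import pandas as pd

s = pd.Series([1.5, 2.5, 3.5], index=['a', 'b', 'c'])

# shape is a one-element tuple because a Series has a single axis
series_shape = s.shape  # (3,)

# The index can be assigned to, as long as the new labels
# match the length of the Series
s.index = [label.upper() for label in s.index]

new_labels = list(s.index)  # ['A', 'B', 'C']

print(series_shape)
print(s)
```

Assigning a list of a different length raises an error, since every value needs exactly one label.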