Python
Data Types and Structures
Programming languages store and process data of various types in variables, and every value is defined by its type. A data type dictates how the data is represented and what operations can be performed on it. One way to give a value a specific data type is casting, which is done by calling the appropriate constructor, such as int(), float(), or str().
If a variable's data type is unknown, Python's built-in type() function returns it:
| type(<variable>) |
Python's built-in data types are broken out into five categories: Numeric, Boolean, Sequence, Mapping, and Set. Generally, data types define the kind of data stored in a variable, and data structures define how the data is organized and stored. In other words, data structures are collections of values of those data types.
These types are detailed below, along with some examples and useful Python methods that facilitate processing data of various types and structures. Practice implementing various data types through these exercises.
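As a minimal sketch of type() and casting, the snippet below inspects a few values and converts between types with their constructors. The variable names (count, price, name) are hypothetical and chosen only for illustration.

```python
# Inspect the type of a few values with the built-in type() function.
count = 42
price = 19.99
name = "Ada"

count_type = type(count)    # <class 'int'>
price_type = type(price)    # <class 'float'>
name_type = type(name)      # <class 'str'>

# Casting: call a constructor to build a value of another type.
count_as_text = str(count)       # '42'
text_as_number = int("7")        # 7
whole_as_float = float(count)    # 42.0

print(count_type, price_type, name_type)
print(count_as_text, text_as_number, whole_as_float)
```

Casting only succeeds when the value can be interpreted as the target type; for example, int("seven") raises a ValueError.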

None
· None: null value or null object, represents the absence of an object
o NOT the same as False, 0, or an empty string or container
o Used as a default value when a true value does not exist or is yet to be defined
o None is the only value of the data type NoneType
§ Example: None
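The following sketch shows None used as a "no result" value and how it differs from False, 0, and an empty string. The find_user function and the users dictionary are hypothetical names invented for this example.

```python
def find_user(user_id, users):
    """Return the matching name, or None when the id is unknown."""
    return users.get(user_id)  # dict.get returns None for a missing key

users = {1: "Ada", 2: "Grace"}
missing = find_user(99, users)

# Test for None with the identity operator "is", not with ==.
is_missing = missing is None          # True

# None is not the same as False, 0, or an empty string.
none_is_false = None is False         # False
none_equals_zero = (None == 0)        # False
none_equals_empty = (None == "")      # False

print(is_missing, none_is_false, none_equals_zero, none_equals_empty)
print(type(None))  # <class 'NoneType'>
```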
Numeric
· Integer: Numeric, Whole
· Float: Numeric, Decimal
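A short sketch of how the two numeric types interact, assuming nothing beyond the built-in int() and float() constructors and round(). The variable names are illustrative only.

```python
whole = 7          # int
decimal = 7.0      # float

# Mixing an int and a float in arithmetic produces a float.
mixed = whole + 0.5            # 7.5

# Casting a float to int truncates toward zero; it does not round.
truncated = int(7.9)           # 7
rounded = round(7.9)           # 8

# Numeric text can be cast with the matching constructor.
from_text = float("3.14")      # 3.14

print(type(whole), type(decimal), mixed, truncated, rounded, from_text)
```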
Boolean
· Boolean: Binary, False/True, 0/1
o True or False values, which behave as 1 and 0, respectively, in numeric contexts. This is particularly useful for conditional checks, where populated variables or data structures evaluate to True, and empty ones evaluate to False.
§ Constructor: bool(<value>)
§ Example: True
§ Example: 1
§ Example: False
§ Example: 0
o True is equal to 1, and False is equal to 0
o bool("") yields False
o bool("<char>") yields True
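The truthiness rules above can be sketched as follows. The cart variable and the messages are hypothetical and only serve to show an empty container evaluating to False.

```python
# Empty or zero-like values are falsy; populated values are truthy.
falsy = [bool(""), bool(0), bool(0.0), bool([]), bool({}), bool(None)]
truthy = [bool("a"), bool(" "), bool(42), bool([0]), bool({"k": 1})]

# bool is a subclass of int, so True and False behave as 1 and 0.
bool_is_int = isinstance(True, int)   # True
true_count = True + True + False      # 2

# Truthiness in a conditional: an empty list takes the else branch.
cart = []
message = "cart has items" if cart else "cart is empty"

print(falsy, truthy, bool_is_int, true_count, message)
```

Note that a string containing only a space is still populated, so bool(" ") is True.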
Sequence
· String: Text, sequence of characters
String Methods
| <upper_cased_string_value> = <string_value>.upper() | Returns <upper_cased_string_value> with all <string_value> characters upper-cased |
| <lower_cased_string_value> = <string_value>.lower() | Returns <lower_cased_string_value> with all <string_value> characters lower-cased |
| <title_string_value> = <string_value>.title() | Returns <title_string_value> with the first character of each word in <string_value> capitalized |
| <stripped_string_value> = <string_value>.strip(<char>) | Returns <stripped_string_value> with leading and trailing <char> characters removed from <string_value>. If <char> is omitted, leading and trailing whitespace is removed, including \t and \n |
| <split_string_value> = <string_value>.split(<delimiter>) | Returns <split_string_value>, a list of the pieces of <string_value> split at <delimiter>. If no delimiter is given, the string is split at whitespace |
| <replaced_string_value> = <string_value>.replace(<current_char>, <new_char>, <count>) | Returns <replaced_string_value>, a copy of <string_value> with instances of <current_char> replaced with <new_char>. <count> is optional and limits how many instances are replaced, starting from the left; if omitted, all instances are replaced |
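A minimal sketch of the string methods in the table, using made-up sample strings. Each method returns a new string (or list); the original string is never modified.

```python
raw = "  hello, python world \n"

stripped = raw.strip()                    # 'hello, python world'
upper_cased = stripped.upper()            # 'HELLO, PYTHON WORLD'
lower_cased = upper_cased.lower()         # 'hello, python world'
title_cased = stripped.title()            # 'Hello, Python World'

# strip() with an argument removes those characters from both ends only.
trimmed = "xx--data--xx".strip("x")       # '--data--'

# split() with no argument splits at whitespace.
words = stripped.split()                  # ['hello,', 'python', 'world']
fields = "a,b,c".split(",")               # ['a', 'b', 'c']

# The optional third argument of replace() is a count of replacements.
replaced_all = "a-b-c-d".replace("-", "+")       # 'a+b+c+d'
replaced_one = "a-b-c-d".replace("-", "+", 1)    # 'a+b-c-d'

print(stripped, upper_cased, title_cased, trimmed, words, fields)
print(replaced_all, replaced_one)
```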
Strings can also contain escape characters, which represent characters that would otherwise be illegal or hard to type in a string. For example, a double quote inside a double-quoted string is illegal because Python already treats single and double quotes as string syntax and would read it as the end of the string. A user can still insert double quotes in such a string by preceding each one with a backslash:
| "My message to this planet we call \"Earth\" is Hello!" |
Escape characters can also define how a string is spaced and distributed. These escape characters are placed directly in the text where the user wants a particular spacing or distribution to occur. Some escape characters include:
String Escape Characters
| \n | new line |
| \t | tab |
| \\ | backslash |
| \' | single quote, apostrophe |
| \" | double quote |
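The escape characters in the table can be sketched as below. The sample strings, including the Windows-style path, are hypothetical.

```python
quoted = "My message to this planet we call \"Earth\" is Hello!"
single = 'Today\'s date'
two_lines = "first line\nsecond line"
tabbed = "name\tvalue"
windows_path = "C:\\temp"   # each \\ is stored as one backslash

print(quoted)
print(single)
print(two_lines)
print(tabbed)
print(windows_path, len(windows_path))
```

Each escape sequence is stored as a single character, which is why the path above has a length of 7 rather than 8.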
String formatting inserts the values of variables into a string at placeholders. These variables may be defined programmatically or as user inputs, and their positions are most commonly marked in the string with curly braces, {}. Examples of how this is achieved are shown below:
** A user is prompted to input the <current_month> and <current_day> as string values:
% is the placeholder for the variable in the string and 's' indicates that the variable is formatted as a string. In this instance, we have multiple string inputs, so the sequence in which the variables are listed determines the sequence in which they occur in the string
| 'Today\'s date is %s/%s' % (<current_month>, <current_day>) |
Empty placeholders are populated sequentially, in the order of the format() arguments
| 'Today\'s date is {}/{}'.format(<current_month>, <current_day>) |
Placeholders containing integers are populated by the format() argument at that index
| 'Today\'s date is {0}/{1}'.format(<current_month>, <current_day>) |
OR
| 'Today\'s date is {1}/{0}'.format(<current_day>, <current_month>) |
Placeholders containing names are populated by the format() keyword argument with that name
| 'Today\'s date is {month}/{day}'.format(month=<current_month>, day=<current_day>) |
The 'f' prefix formats the string literal directly (f-strings, introduced in Python 3.6)
| f'Today\'s date is {<current_month>}/{<current_day>}' |
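The formatting styles above can be compared side by side in a short sketch. To keep it runnable without a prompt, the month and day are hard-coded here; in the scenario described they would come from input(). All five styles should produce the same text.

```python
# Assumed sample values; in practice these could come from input().
current_month = "07"
current_day = "04"

percent_style = 'Today\'s date is %s/%s' % (current_month, current_day)
empty_braces = 'Today\'s date is {}/{}'.format(current_month, current_day)
indexed = 'Today\'s date is {1}/{0}'.format(current_day, current_month)
named = 'Today\'s date is {month}/{day}'.format(month=current_month, day=current_day)
f_string = f'Today\'s date is {current_month}/{current_day}'

for text in (percent_style, empty_braces, indexed, named, f_string):
    print(text)
```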
If a float type is being inserted into a string, it can follow the methods above or be further formatted using the methods below:
** A user is prompted to input a price:
The price is inserted into a string and formatted to show 2 decimal places
| 'The price is {:.2f}'.format(price) |
The price is inserted into a string literal and formatted to show 3 decimal places
| f'The price is {price:.3f}' |
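A brief sketch of the float formatting above. It assumes the price arrives as text (as it would from input()) and is cast with float() first, because the 'f' format code expects a number; the sample value "19.5" is made up.

```python
# input() returns a string, so a typed price must be cast before formatting.
price_text = "19.5"          # stand-in for a value typed by the user
price = float(price_text)

two_places = 'The price is {:.2f}'.format(price)
three_places = f'The price is {price:.3f}'

print(two_places)     # The price is 19.50
print(three_places)   # The price is 19.500
```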
Data Structures
· Data structures are containers within which data can be stored. Data within data structures can be standalone data of the types defined above, or other data structures through a process known as nesting
· Terminology
o Ordered: Container retains the order of the values
o Mutable: Container values can be changed
o Heterogeneous: Container can hold more than one data type
o Duplicates: Container allows repeated values
· The four data structures to know are Lists, Tuples, Dictionaries, and Sets.
· List: An ordered sequence structure for storing a collection of items in a single variable
o It is one of the most versatile and frequently used data types in the language
o Syntax: [<val_0>, <val_1>]
o Properties:
§ Ordered: YES
§ Mutable: YES
§ Heterogeneous: YES
§ Duplicates: YES
§ Indexable: YES (integer)
List Methods
| list() | Constructor |
| <list_obj_length> = len(<list_obj>) | Get length of <list_obj>, which is the number of objects in <list_obj> |
| <value_count> = <list_obj>.count(<value>) | Count how many times a specific <value> appears in <list_obj> |
| <value> = <list_obj>[<index>] | Index <list_obj> to access the value in the indicated <index> position |
| <index> = <list_obj>.index(<value>) | Get the index of the first occurrence of a <value> that exists in <list_obj> |
| <list_obj>.insert(<index>, <value>) | Inserts <value> into the indicated index in <list_obj>, in-place |
| <list_obj>.remove(<value>) | Removes one (first as it appears) instance of <value> from <list_obj>, in-place |
| <list_obj>.sort(reverse=<boolean>) | Sorts <list_obj> values, in-place. reverse=False: ascending (default), reverse=True: descending |
| <new_list_obj> = sorted(<list_obj>, reverse=<boolean>) | Creates <new_list_obj> of <list_obj> values sorted, leaving <list_obj> unchanged. reverse=False: ascending (default), reverse=True: descending |
| <popped_val> = <list_obj>.pop(<index>) | Removes the value in the indicated <index> in <list_obj>, returns the value as <popped_val>, and <list_obj> now exists without the value. If <index> is omitted, the last value is removed |
| <list_obj>.append(<value>) | Appends <value> to the end of <list_obj>, in-place |
| <list_obj>.extend(<other_list>) | Appends all values in <other_list> to the end of <list_obj>, in-place |
| <new_list_obj> = <list_obj_1> + <list_obj_2> | Combine <list_obj_1> and <list_obj_2> into one <new_list_obj> |
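The list methods in the table can be walked through in one sketch. The scores list is hypothetical, and the comments track its contents after each in-place change.

```python
scores = [70, 95, 82, 95]

length = len(scores)             # 4
count_95 = scores.count(95)      # 2
first = scores[0]                # 70
index_95 = scores.index(95)      # 1 (first occurrence)

scores.insert(1, 60)             # [70, 60, 95, 82, 95]
scores.remove(95)                # [70, 60, 82, 95] (first 95 removed)

ascending = sorted(scores)                  # new list: [60, 70, 82, 95]
descending = sorted(scores, reverse=True)   # new list: [95, 82, 70, 60]
scores.sort(reverse=True)                   # in-place: [95, 82, 70, 60]

last = scores.pop()              # 60, scores is now [95, 82, 70]
popped = scores.pop(0)           # 95, scores is now [82, 70]

scores.append(88)                # [82, 70, 88]
scores.extend([91, 77])          # [82, 70, 88, 91, 77]
combined = scores + [100]        # new list; scores is unchanged

print(scores, combined)
```

The in-place methods (insert, remove, sort, append, extend) return None, so their result should not be assigned back to the list variable.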
· Tuple: An ordered sequence structure for storing a collection of items in a single variable
o Syntax: (<val_0>, <val_1>)
o Properties:
§ Ordered: YES
§ Mutable: NO
§ Heterogeneous: YES
§ Duplicates: YES
§ Indexable: YES (integer)
Tuple Methods
| tuple() | Constructor |
| <tuple_object_length> = len(<tuple_obj>) | Get <tuple_object_length> of <tuple_obj>, which is the number of values in <tuple_obj> |
| <value_count> = <tuple_obj>.count(<value>) | Count how many times a specific <value> appears in <tuple_obj> |
| <value> = <tuple_obj>[<index>] | Index <tuple_obj> to get <value> in the indicated <index> position |
| <index> = <tuple_obj>.index(<value>) | Get the index of the first occurrence of a <value> that exists in <tuple_obj> |
| <new_tuple_obj> = <tuple_obj_1> + <tuple_obj_2> | Combine <tuple_obj_1> and <tuple_obj_2> into one <new_tuple_obj>. A single-value tuple requires a trailing comma to be recognized as a tuple |
Values can be appended to tuples. However, given their immutable nature, there is a specific process by which this is achieved, and the result is always a new tuple. Tuples can only be concatenated with other tuples, so the following process satisfies this requirement:
1. Identify the primary tuple
| primary_tuple = (1, 2, 3) |
2. Identify the other tuple within which values to append are contained
| other_tuple = ('a', 'b', 'c') |
3. Perform an addition of the primary tuple and other tuple, and assign to a variable
| concat_tuple = primary_tuple + other_tuple |
4. The new tuple contains the primary tuple values and the other tuple values
| concat_tuple = (1, 2, 3, 'a', 'b', 'c') |
Concatenation never modifies the existing tuple; it builds a new one. If an individual value is being appended to an existing tuple, it must first be wrapped in a single-value tuple, thereby satisfying the tuples requirement:
| primary_tuple = (1, 2, 3) |
| concat_tuple = primary_tuple + (4, ) |
| concat_tuple = (1, 2, 3, 4) |
OR
| primary_tuple = (1, 2, 3) |
| concat_tuple = primary_tuple + ('a', ) |
| concat_tuple = (1, 2, 3, 'a') |
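The concatenation process can be sketched as runnable code. It reuses the tuple names from the steps above; with_single, not_a_tuple, and error_name are hypothetical names added to show the trailing comma and the error raised when a bare value is added.

```python
primary_tuple = (1, 2, 3)
other_tuple = ('a', 'b', 'c')

# Concatenation builds a new tuple; primary_tuple itself is unchanged.
concat_tuple = primary_tuple + other_tuple    # (1, 2, 3, 'a', 'b', 'c')

# A single value needs a trailing comma to be a tuple.
with_single = primary_tuple + (4,)            # (1, 2, 3, 4)
not_a_tuple = (4)                             # just the int 4

# Adding a bare value to a tuple is not allowed.
try:
    primary_tuple + 4
except TypeError as error:
    error_name = type(error).__name__         # 'TypeError'

print(concat_tuple, with_single, type(not_a_tuple), error_name)
```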
Note that the parentheses when defining a tuple are optional, as Python recognizes when a tuple is being created:
| x = 5, 11 |
is the same as
| x = (5, 11) |
but is NOT the same as
| x = (11, 5) |
because tuples are ordered. The parentheses indicate how the values are stored in the collection. When the tuple is a part of another collection, the parentheses are necessary to define the structure. For example:
| y = [5, 11] |
is NOT the same as
| z = [(5, 11)] |
is NOT the same as
| a, b = 5, 11 |
In the examples above, variable 'y' is a list of two values, whereas variable 'z' is a list containing a single tuple. The variables 'a, b' are assigned integers 5 and 11, respectively, through what is known as unpacking (also called destructuring).
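A small sketch of tuple packing and unpacking, reusing the values from the examples above. The comparison variable names and the swap at the end are illustrative additions.

```python
x = 5, 11                         # packed into the tuple (5, 11)
same_as_parenthesized = (x == (5, 11))    # True
same_as_reversed = (x == (11, 5))         # False, order matters

y = [5, 11]       # list of two integers
z = [(5, 11)]     # list holding one tuple
y_length = len(y)                 # 2
z_length = len(z)                 # 1

a, b = 5, 11      # unpacking: a is 5, b is 11
a, b = b, a       # unpacking also allows swapping in one line

print(x, same_as_parenthesized, same_as_reversed, y_length, z_length, a, b)
```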
Mapping
· Dictionary: Also known as a hashmap, a mapping data structure that stores data in key-value pairs
o Efficiently retrieves stored data by mapping a key (think of this as an address) to its corresponding value
§ This is significantly more efficient than sequentially iterating through a data structure index by index
o Syntax: {<key>: <value>}
o Properties:
§ Ordered: YES in Python 3.7+ (keys keep the order in which they were inserted)
§ Mutable: YES (keys must be immutable, values are mutable)
§ Heterogeneous: YES (keys can be different data types)
§ Duplicates: NO (keys must be unique, values can be repeated)
§ Indexable: YES (keys)
Dictionary Methods
| dict() | Constructor |
| <all_dict_keys> = <dict_obj>.keys() | Returns a view of all <dict_obj> keys (wrap it in list() to get a list) |
| <all_dict_keys> = list(<dict_obj>) | Returns a list of all <dict_obj> keys |
| <all_dict_values> = <dict_obj>.values() | Returns a view of all <dict_obj> values |
| <key_value_pairs> = <dict_obj>.items() | Returns a view of <dict_obj> key-value pairs as (key, value) tuples |
| <value> = <dict_obj>[<key>] | Returns <value> from <dict_obj> at the <key>, raising a KeyError if <key> does not exist |
| <value> = <dict_obj>.get(<key>) | Returns <value> from <dict_obj> at the <key>, or None if <key> does not exist |
| <popped_val> = <dict_obj>.pop(<key>) | Removes the value at <key> in <dict_obj>, returns the value as <popped_val>, and <dict_obj> now exists without the key-value pair |
| <dict_obj>.update({<key>: <value>}) | Inserts <value> at <key> in <dict_obj>, in-place, overwriting any value already stored at <key> |
| <dict_obj>.clear() | Removes all elements from <dict_obj>, in-place |
· Keys and Values don't have defined data types, so the developer gets to define them in the code design
o Keys can be any immutable data type (str, int, float, bool, None, or a tuple of such values)
§ Keys are unique
§ Keys are immutable
o Values can be any data type (str, int, float, bool, None) OR data structure (list, tuple, set, dict)
§ There is no limit to how many data structures you nest, but indexing can become complex when you need to access the dictionary values
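The dictionary methods and nesting described above can be sketched as follows. The inventory, store, and grid dictionaries are hypothetical sample data.

```python
inventory = {"apples": 3, "pears": 0}

keys = list(inventory.keys())        # ['apples', 'pears']
values = list(inventory.values())    # [3, 0]
pairs = list(inventory.items())      # [('apples', 3), ('pears', 0)]

apples = inventory["apples"]         # 3
missing = inventory.get("kiwis")     # None instead of a KeyError
fallback = inventory.get("kiwis", 0) # 0, an explicit default

inventory.update({"kiwis": 12})      # adds a new key-value pair
removed = inventory.pop("pears")     # 0, and "pears" is gone

# Nested structures are indexed one level at a time.
store = {"fruit": {"apples": [3, 5]}}
nested_value = store["fruit"]["apples"][1]   # 5

# A tuple of immutable values can serve as a key; a list cannot.
grid = {(0, 0): "origin"}

print(inventory, nested_value, grid[(0, 0)])
```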
Set
· Set: Unordered collection of unique elements. The structure is designed to test membership and perform mathematical set operations, like those pictured in a Venn diagram
o Syntax: {<val_0>, <val_1>}
o Properties:
§ Ordered: NO
§ Mutable: YES
§ Heterogeneous: YES
§ Duplicates: NO
§ Indexable: NO (unordered structure cannot be indexed)
Set Methods
| set() | Constructor |
| <set_obj>.add(<value>) | Adds <value> to <set_obj>, in-place |
| <set_obj>.update(<other_set_obj>) | Inserts values from <other_set_obj> into <set_obj>, in-place |
| <set_obj>.remove(<value>) | Removes <value> from <set_obj>, raising an error if <value> does not exist, in-place |
| <set_obj>.discard(<value>) | Removes <value> from <set_obj>, NOT raising an error if <value> does not exist, in-place |
| <popped_val> = <set_obj>.pop() | Removes an arbitrary value from <set_obj>, returns the value as <popped_val>, and <set_obj> now exists without the value |
| <set_obj>.clear() | Removes all values from <set_obj>, in-place |
Set Value Membership Methods
| <new_set_obj> = <set_obj>.union(<other_set_obj>) | Returns <new_set_obj> with ALL values from <set_obj> and <other_set_obj> |
| <set_obj>.update(<other_set_obj>) OR <set_obj> |= <other_set_obj> | Updates <set_obj> in-place, keeping ALL values from both <set_obj> and <other_set_obj> |
| <new_set_obj> = <set_obj>.difference(<other_set_obj>) | Returns <new_set_obj> with values that exist in <set_obj> but not in <other_set_obj> |
| <set_obj>.difference_update(<other_set_obj>) OR <set_obj> -= <other_set_obj> | Updates <set_obj> in-place, keeping only values existing in <set_obj> BUT NOT in <other_set_obj> |
| <new_set_obj> = <set_obj>.symmetric_difference(<other_set_obj>) | Returns <new_set_obj> with values that exist in <set_obj> OR <other_set_obj>, but not in both |
| <set_obj>.symmetric_difference_update(<other_set_obj>) OR <set_obj> ^= <other_set_obj> | Updates <set_obj> in-place, keeping only values existing in <set_obj> OR <other_set_obj>, not both |
| <new_set_obj> = <set_obj>.intersection(<other_set_obj>) | Returns <new_set_obj> with only values that exist in <set_obj> AND <other_set_obj> |
| <set_obj>.intersection_update(<other_set_obj>) OR <set_obj> &= <other_set_obj> | Updates <set_obj> in-place, keeping only values existing in BOTH <set_obj> AND <other_set_obj> |
NOTE: 1 and True, and 0 and False are considered
the same value in sets, so the Boolean value and equivalent numeric value
cannot exist in the same set
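The set operations and the note about 1/True can be sketched together. The evens and small sets are made-up sample data, and the operator forms (|, &, -, ^) are used alongside one method call to show that they agree.

```python
evens = {2, 4, 6, 8}
small = {1, 2, 3, 4}

union = evens | small                    # {1, 2, 3, 4, 6, 8}
intersection = evens & small             # {2, 4}
difference = evens - small               # {6, 8}
symmetric = evens ^ small                # {1, 3, 6, 8}
methods_agree = evens.union(small) == union   # True

# In-place variant: working is a copy, so evens is left untouched.
working = set(evens)
working &= small                         # {2, 4}

working.discard(99)                      # no error for a missing value
try:
    working.remove(99)                   # remove() raises KeyError instead
except KeyError:
    remove_failed = True

# 1 and True (and 0 and False) collapse into a single element each.
mixed = {1, True, 0, False}

print(union, intersection, difference, symmetric, working, len(mixed))
```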
Operators
Assignment Operator
| = | assignment operator, assigns a value to a variable |
Conditional Statements
Code will step through the comparison block from top to bottom and exit the block once a condition is met, even if the conditions later in the block have not been checked yet. The order in which a comparison block is built matters!
if <conditional_statement>:
    <action to execute>
elif <conditional_statement>:
    <action to execute>
elif <conditional_statement>:
    <action to execute>
else:
    <action to execute for all other conditions not captured above>
| if | primary comparison (only one per comparison block) |
| elif | alternative comparison (can be multiple elif statements in a comparison block) |
| else | all other comparisons (only one per comparison block) |
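To illustrate why the order of conditions matters, here is a sketch with two hypothetical grading functions. The thresholds and letter grades are invented for the example; the second function checks the broadest condition first, so its later branch can never run.

```python
def grade(score):
    """Check the most restrictive condition first."""
    if score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    elif score >= 70:
        return "C"
    else:
        return "F"


def grade_misordered(score):
    """Same thresholds, but the broad check comes first."""
    if score >= 70:
        return "C"
    elif score >= 90:   # never reached: any score >= 90 is also >= 70
        return "A"
    else:
        return "F"


print(grade(95), grade_misordered(95))
```

Both functions receive the same score, but only the first returns the intended grade because the block exits at the first condition that is met.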
Comparison Operators
| == | determine equality |
| != | determine inequality |
| > | determine greater than |
| >= | determine greater than or equal to |
| < | determine less than |
| <= | determine less than or equal to |
Identity Operator
| is | determines whether two names refer to the same object (identity), not whether their values are equal |
Logical Operators
| and | combines comparisons, returns 'True' if both conditions are met |
| or | combines comparisons, returns 'True' if at least one condition is met |
| not | negates a boolean or other truth value |
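A short sketch contrasting equality with identity and combining comparisons with the logical operators. The lists and the age value are hypothetical.

```python
first = [1, 2, 3]
second = [1, 2, 3]
alias = first                 # a second name for the same list object

equal_values = first == second    # True: same contents
same_object = first is second     # False: two separate objects
alias_is_first = alias is first   # True: one object, two names

age = 30
working_age = age >= 18 and age < 65     # True: both conditions met
outside_range = age < 18 or age >= 65    # False: neither condition met
negated = not working_age                # False
not_equal = age != 31                    # True

print(equal_values, same_object, alias_is_first)
print(working_age, outside_range, negated, not_equal)
```

As a rule of thumb, use == to compare values and reserve "is" for checks against None.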
Math Operators
| + | Addition |
| - | Subtraction |
| * | Multiplication |
| ** | Exponent |
| / | Division |
| // | floor division (performs division and rounds down to the integer) |
| % | modulus (returns the remainder of a division operation) |
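The math operators can be checked with a few small values. The numbers are arbitrary; the last lines show that floor division rounds down (toward negative infinity) rather than toward zero.

```python
total = 7 + 2           # 9
gap = 7 - 2             # 5
product = 7 * 2         # 14
power = 7 ** 2          # 49
quotient = 7 / 2        # 3.5 (division always returns a float)
even_quotient = 6 / 3   # 2.0
floored = 7 // 2        # 3
remainder = 7 % 2       # 1

# With a negative operand, floor division still rounds down.
negative_floored = -7 // 2      # -4
negative_remainder = -7 % 2     # 1

print(total, gap, product, power, quotient, even_quotient)
print(floored, remainder, negative_floored, negative_remainder)
```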
Helpful Resources