{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Python Collections, Numpy Arrays, and Slicing\n", "\n", "Below, we explore some basic Python programs for demonstrating use of Python's built-in data types, focusing on **collections** and `numpy` **arrays**. We also introduce the concept of **slicing**, which allows us to quickly and cleanly extract subset of sequence and array types.\n", "\n", "\n", "## Collections\n", "\n", "Python has built-in support for \"collection\" data types - these are complex data types that hold other collections of other data. Python provides three basic collection types:\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
TypeDescriptionExamples
StringCollection of characters, indicated by single or double quotes\n", "
  • \"this is a string\"
  • 'this is also a string'
  • 'ABC-123'
\n", "
ListCollection of 'mutable' objects (they can be changed), indicated by brackets ([ ])
  • [1,2,3]
  • [ 'a', 32, 14.7 ]'
  • [ 'A','list','of','strings' ]
  • [\"a list containing\", [ \"another list\" ] ]
TupleCollection of 'unmutable' objects (they can NOT be changed), indicated by parentheses (( ))
  • (1,2,3)
  • ( 'a', 32, 14.7 )'
  • [ 'A','tuple','of','strings' ]
  • [\"a tuple containing\", ( \"another tuple\" ) )
DictionaryCollection of key:value pairs, where keys are used to look up values, indicated by braces ({ })
  • {'name':'John', 'age':56}
\n", "\n", "Additionally, **numpy** provides array and matrix types that we will use extensively." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Making Collections\n", "\n", "Collections can be created in multiple ways..." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "string: this is a string\n", "list: [1, 2, 3, 4, 5]\n", "tuple: ('this is a tuple', 4, 'a tuple within a tuple')\n", "np.zeros(): [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]\n", "np.ones(): [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]\n", "np.arange(): [0. 0.5 1. 1.5 2. 2.5 3. 3.5 4. 4.5]\n", "np.linspace(): [0. 0.5 1. 1.5 2. 2.5 3. 3.5 4. 4.5 5. ]\n", "2x3 matrix:\n", " [[0 0 0]\n", " [0 0 0]]\n", "dictionary: {'name': 'John', 'age': 10}\n" ] } ], "source": [ "import numpy as np\n", "\n", "# make a string\n", "mystr = \"this is a string\"\n", "print(\"string:\", mystr)# make a list\n", "\n", "# make a list\n", "mylist = [1,2,3,4,5]\n", "print(\"list:\", mylist)\n", "\n", "# make a tuple\n", "mytuple = (\"this is a tuple\", 4, (\"a tuple within a tuple\"))\n", "print(\"tuple:\", mytuple)\n", "\n", "# make a few numpy arrays\n", "mynp_zeros = np.zeros(10) # array of ten zeros\n", "print(\"np.zeros():\", mynp_zeros)\n", "mynp_ones = np.ones(10) # array of ten ones\n", "print(\"np.ones():\", mynp_ones)\n", "mynp_range = np.arange(0,5,0.5) # range of values, starting at 0, ending at 4.5, in increments of 0.5\n", "print(\"np.arange():\", mynp_range)\n", "mynp_range2 = np.linspace(0,5,11) # evenly spaced range of 10 values, starting at 0, ending at 5\n", "print(\"np.linspace():\", mynp_range2)\n", "mynp_matrix = np.zeros((2,3), dtype=int) # 2D matrix of integers, filled with zeros\n", "print(\"2x3 matrix:\\n\", mynp_matrix)\n", "\n", "mydictionary = {'name':'John', 'age':10 } # dictionary (key:value pairs\n", "print(\"dictionary:\", mydictionary)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Counting, appending, inserting, deleting items in a List" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "length of a: 6\n", "After +: [1, 2, 3, 4, 5, 6, 7, 8]\n", "After append(): [1, 2, 3, 4, 5, 6, 7, 8, 10]\n", "After insert(): [1, 2, 3, 4, 5, 0, 6, 7, 8, 10]\n", "After remove(): [1, 2, 3, 4, 5, 6, 7, 8, 10]\n", "After del(): [1, 2, 4, 5, 6, 7, 8, 10]\n", "After clear(): []\n" ] } ], "source": [ "a = [1,2,3,4,5,6] # 'a' is a list, but it works pretty much the same with numpy arrays\n", "\n", "print( \"length of a: {}\".format(len(a))) # len(collection) returns the count of items in the collection\n", "\n", "# append list\n", "a = a + [7,8] # '+' returns a new object, so we have to assign a variable to it\n", "print(\"After +: \", a)\n", "\n", "# append single item \n", "a.append(10) # .append() operates on an existing object, so no assignment necessary\n", "print(\"After append(): \", a)\n", "\n", "# insert an item (0) at a specific location (index 5) - remember indexes are zero-based\n", "a.insert(5,0)\n", "print(\"After insert():\", a)\n", "\n", "# remove a particular item in the collection\n", "a.remove(0)\n", "print(\"After remove():\", a)\n", "\n", "# delete an item at a particular index location\n", "del(a[2])\n", "print(\"After del():\", a)\n", "\n", "# clear out an array\n", "a.clear() # or a=[]\n", "print(\"After clear():\", a)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Counting, appending, inserting, deleting, reshaping items in a numpy array\n", "\n", "These are but a few of the ways - see the numpy docs at https://numpy.org for more information." ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "length of a: 6\n", "After +: [1 2 3 4 5 6 7 8]\n", "After append(): [ 1 2 3 4 5 6 7 8 10]\n", "After insert(): [ 1 2 3 4 5 0 6 7 8 10]\n", "After delete(): [ 1 2 4 5 0 6 7 8 10]\n", "After reshape(): [[ 1 2 4]\n", " [ 5 0 6]\n", " [ 7 8 10]]\n" ] } ], "source": [ "import numpy as np\n", "\n", "a = np.array([1,2,3,4,5,6]) # 'a' now a numpy array\n", "\n", "print( \"length of a: {}\".format(len(a))) # len(collection) returns the count of items in the collection\n", "\n", "# append list\n", "a = np.concatenate((a,[7,8])) # returns a new object, so we have to assign a variable to it\n", "print(\"After +: \", a)\n", "\n", "# append single item \n", "a = np.append(a,[10]) # np.append() returns a new oject, assignment necessary\n", "print(\"After append(): \", a)\n", "\n", "# insert an item (0) at a specific location (index 5) - remember indexes are zero-based\n", "a = np.insert(a,5,0)\n", "print(\"After insert():\", a)\n", "\n", "# delete an item at a particular index location\n", "a = np.delete(a,2)\n", "print(\"After delete():\", a)\n", "\n", "# reshape an array (returns new array (or view))\n", "a = a.reshape((3,3))\n", "print(\"After reshape():\", a)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Dictionaries\n", "\n", "Dictionaries are a slightly different, but often useful, collection type. Instead of accessing elements by location (index), we access them by providing a **key**, which can be a string, number or about anything else. Dictionaries are useful when it is convenient to store information by keyword lookup.\n", "\n", "In the example below, we use a dictionary to store information about an satellite image, but the concept can be extended to about anything." ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'location': 'Oregon', 'center': (45.7, 132.34), 'width': 800, 'height': 800, 'date': '12/12/2109'}\n", "length: 5\n", "(45.7, 132.34)\n", "12/12/2109\n" ] } ], "source": [ "# make an empty dictionary\n", "imageInfo = {} # 'a' is now an empty dictionary\n", "\n", "# add some key:value pairs to \n", "imageInfo['location'] = 'Oregon'\n", "imageInfo['center'] = (45.7,132.34)\n", "imageInfo['width'] = 800\n", "imageInfo['height'] = 800\n", "imageInfo['date'] = '12/12/2109'\n", "\n", "print(imageInfo)\n", "print( \"length: {}\".format(len(imageInfo))) # len(collection) returns the count of items in the collection\n", "\n", "# use key to 'look up' elements\n", "print(imageInfo['center'])\n", "print(imageInfo['date'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Accessing items in a Collection\n", "\n", "For all of the Collections described above, accessing items in the collection is accomplished using the **[ ]** operators as follows (remember, indexes are zero-based): " ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "3\n", "3\n", "3\n", "8\n", "value2\n" ] } ], "source": [ "import numpy as np\n", "\n", "mylist = [1,2,3,4]\n", "mytuple = (1,2,3,4)\n", "myarray = np.array([1,2,3,4])\n", "mymatrix = np.arange(0,9).reshape(3,3)\n", "mydict = { \"key1\":\"value1\", \"key2\":\"value2\" }\n", "\n", "print(mylist[2])\n", "print(mytuple[2])\n", "print(myarray[2])\n", "print(mymatrix[2,2])\n", "print(mydict['key2'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Slicing\n", "\n", "Slicing is an extremely powerful and useful way to extract subsets of elements from list, tuples, numpy arrays, and numpy matrices, and is supported by many additional libraries as well. Slicing builds on the bracket accessor operator (**[ ]**) notation, and is best demonstrated by example: " ] }, { "cell_type": "code", "execution_count": 69, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "mylist: [0, 1, 2, 3, 4]\n", "myarray: [0 1 2 3 4 5 6 7 8]\n", "mymatrix:\n", " [[0 1 2]\n", " [3 4 5]\n", " [6 7 8]]\n", "mylist[1:3]: [1, 2]\n", "myarray[1:3]: [1 2]\n", "mymatrix[0:2,1:3]:\n", " [[1 2]\n", " [4 5]]\n", "mylist[:3]: [0, 1, 2]\n", "myarray[2:]: [2 3 4 5 6 7 8]\n", "mymatrix[:2,2:]:\n", " [[2]\n", " [5]]\n", "myarray[:]: [0 1 2 3 4 5 6 7 8]\n", "mymatrix[:,0:2]:\n", " [[0 1]\n", " [3 4]\n", " [6 7]]\n", "mymatrix[1,:]:\n", " [[3 4 5]]\n" ] } ], "source": [ "import numpy as np\n", "\n", "mylist = [0,1,2,3,4]\n", "myarray = np.array([0,1,2,3,4,5,6,7,8])\n", "mymatrix = mymatrix.reshape(3,3)\n", "print(\"mylist:\", mylist)\n", "print(\"myarray:\", myarray)\n", "print(\"mymatrix:\\n\", mymatrix)\n", "\n", "# get a subset - [start:end] NOTE: up to BUT NOT INCLUDING end!\n", "print( \"mylist[1:3]:\", mylist[1:3]) # start at index 1, end at index 2\n", "print( \"myarray[1:3]:\", myarray[1:3])\n", "\n", "# for 2D matrices, we need to specify row and columns\n", "print( \"mymatrix[0:2,1:3]:\\n\", mymatrix[0:2,1:3]) # rows 1 and 2, column 0\n", "\n", "# you can \"default\" to start or end\n", "print( \"mylist[:3]:\", mylist[:3]) # start at beginning, end at index 2\n", "print( \"myarray[2:]:\", myarray[2:]) # start at index 2, end at end\n", "print( \"mymatrix[:2,2:]:\\n\", mymatrix[:2,2:]) # rows 0 and 1, last oolumn\n", "\n", "# ':'' by itself is \"all\"\n", "print( \"myarray[:]:\", myarray[:]) # all items\n", "print( \"mymatrix[:,0:2]:\\n\", mymatrix[:,0:2]) # all rows, columns 0-1\n", "print( \"mymatrix[1,:]:\\n\", mymatrix[1:2,:]) # row 1, all columns," ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.8" }, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false } }, "nbformat": 4, "nbformat_minor": 1 }