Python + SAS work in Tandem with SASPY



So you have been SAS aficionado and you have recently learnt python as well but are sick of switching between SAS and Python and using .csv files as intermediary data transfer. How do you do all SAS'y stuff in Python??



Well the solution to this is using SASPY which is a package used in tandem with Jupyter Notebook. I managed to configure SASPY to work with SAS GRID and local enterprise guide installation. Basically SASPY whips up a EG session (given EG is installed locally) connects to a SAS GRID and using SAS MAGIC (they are magical functions) execute SAS code directly on SAS grid.

The best part is you can retrieve the data from SAS grid and transfer it to a pandas data frame. That's true sas7bdat to pandas data frame, whilst using SAS data steps (more efficient.......sometimes) and then processing the data for text analysis(for example) as SAS enterprise license and text miner on top of it is really expensive!! Plus you cannot do fancy stuff like LDA on SAS text miner and brag about it!!(joking)


Below are few simple steps to make this work, I have illustrated an example below which shows how to connect from locally installed Jupyter Notebook to SAS GRID. One can use SASPY to also use a local copy of SAS and use libname syntax to log on to databases and execute SAS data steps/code.

Step 1.

Download the saspy package

#If installing through anaconda
conda install saspy
#If installing through pypi
pip install saspy

Step 2.

If you want to connect to SAS Grid through your locally installed Enterprise Guide then the next step is to amend a file named sascfg.py.

This file is located in site packages folder within Anaconda installation for e.g.

C:\Program Data\Anaconda\Lib\site-packages\saspy
OR
C:\Program Files\Anaconda\Lib\site-packages\saspy

Step 3.

Open the sascfg.py preferably in notepad++ and edit it as follows. I have also provided a complete snippet of the sascfg file below and editing instructions in "purple" and the code updated in "pink-red".

a) The configured options used to connect with SAS GRID , which also have been detailed in the code snippet below as well.

First is to update the sas configuration, initially it will be pointing at "default". this can be updated a per below

SAS_config_names=['default','winlocal','winiomwin','winiomlinux','winiomIWA','iomcom']

b) Update the configuration for winlocal , winiomwin, winiomlinux, winiomIWA and iomcom.

There are three pieces of information which need to be derived and I have illustrated how to derive them below.

HOST NAME & PORT: In SAS EG right click the server you want to connect (if grid there would be multiple servers).


After right clicking go to properties and you would be able to see host name and port.


CLASS ID: This information can be derived by running the following code:

proc iomoperate;
   list types;
quit;

Once above code has been run look for "Class identifier" for "SAS METADATA SERVER".

Use the alphanumeric combination (note: use uppercase for characters e.g. "e" becomes "E").


Step 4.

Finally using jupyter notebook to conenct to SAS grid!!

Import saspy and pywin (necessary if you are using iom methods to form the connection).

After importing necessary packages start a SASsession. If how is the query then the answer is in the screenshots below:


a) IMPORT Packages

b) LAUNCH Session: which will ask for you user id


c) ENTER User id and password


d) IF you remember them correctly following should happen!


e) Do disconnect as you will be using a licence instance of your SAS and if you lookc too many in be ready to get that cold tap on the shoulder from your admin.


So disconnect after each session

sas.disconnect()
Out[10]: 'Succesfully disconnected. Be sure to have a valid network connection before submitting anything else.'
How to set up sascfg files and other small details shown below!!!
 

After deriving the above information the code which needs to be updated is as follows:

So lets Code!!!!

Complete Code below edited code and comments in "pink-red" and "purple":

SASCFG.PY :
#
# Copyright SAS Institute
#
#  Licensed under the Apache License, Version 2.0 (the License);
#  you may not use this file except in compliance with the License.
#  You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
#  Unless required by applicable law or agreed to in writing, software
#  distributed under the License is distributed on an "AS IS" BASIS,
#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#  See the License for the specific language governing permissions and
#  limitations under the License.
#

# THIS IS AN EXAMPLE CONFIG FILE. PLEASE CREATE YOUR OWN sascfg_personal.py FILE USING THE APPROPRIATE TEMPLATES FROM BELOW
# SEE THE CONFIGURATION DOC AT https://sassoftware.github.io/saspy/install.html#configuration


# Configuration Names for SAS - python List
# This is the list of allowed configuration definitions that can be used. The definition are defined below.
# if there is more than one name in the list, and cfgname= is not specified in SASsession(), then the user
# will be prompted to choose which configuration to use.
#
# The various options for the different access methods can be specified on the SASsession() i.e.:
# sas = SASsession(cfgname='default', options='-fullstimer', user='me')
#
# Based upon the lock_down configuration option below, you may or may not be able to override option
# that are defined already. Any necessary option (like user, pw for IOM or HTTP) that are not defined will be 
# prompted for at run time. To disallow overrides of as OPTION, when you don't have a value, simply
# specify options=''. This way it's specified so it can't be overridden, even though you don't have any
# specific value you want applied.
# 
#SAS_config_names = ['default', 'ssh', 'iomlinux', 'iomwin', 'winlocal', 'winiomlinux', 'winiomwin', 'httpsviya', 'httpviya', 'iomcom']
#
# NEEDS CHANGE OF DEFAULT VALUE. IN THE FIRST INSTANCE YOU WILL ONLY SEE 'default' AND YOU HAVE TO MANUALLY ADD OTHER CONFIG_NAMES WHEN YOU OPEN THIS FILE FIRST

SAS_config_names=['default','winlocal','winiomwin','winiomlinux','winiomIWA','iomcom']



# Configuration options for saspy - python Dict   # not required unless changing any of the defaults
# valid key are:
# 
# 'lock_down' - True | False. True = Prevent runtime overrides of SAS_Config values below
#
# 'verbose'   - True | False. True = Allow print statements for debug type messages
#
# 'prompt'    - True | False. True = Allow prompting as necessary
#
SAS_config_options = {'lock_down': False,
                      'verbose'  : True,
                      'prompt'   : True
                     }



# Configuration options for SAS output. By default output is HTML 5.0 (using "ods html5" statement) but certain templates might not work 
# properly with HTML 5.0 so it can also be set to HTML 4.0 instead (using "ods html" statement). This option will only work when using IOM
# in local mode. Note that HTML 4.0 will generate images separately which clutters the workspace and if you download the notebook as HTML, 
# the HTML file will need to be put in the same folder as the images for them to appear.
# valid key are:
# 
# 'output' = ['html5', 'html']
#
SAS_output_options = {'output' : 'html5'}       # not required unless changing any of the default



# Configuration Definitions
#
# For STDIO and STDIO over SSH access methods
# These need path to SASHome and optional startup options - python Dict
# The default path to the sas start up script is: /opt/sasinside/SASHome/SASFoundation/9.4/sas
# A usual install path is: /opt/sasinside/SASHome
#
# The encoding is figured out by saspy. You don't need to specify it, unless you just want to get rid of the message about which encoding was determined.
#                                                                                                         
# valid keys are:
# 'saspath'  - [REQUIRED] path to SAS startup script i.e.: /opt/sasinside/SASHome/SASFoundation/9.4/sas
# 'options'  - SAS options to include in the start up command line - Python List
# 'encoding' - This is the python encoding value that matches the SAS session encoding your SAS session is using 
#
# For password less ssh connection, the following are also required:
# 'ssh'     - [REQUIRED] the ssh command to run
# 'host'    - [REQUIRED] the host to connect to
#
# Additional valid keys for ssh:
# 'port'    - [integer] the remote ssh port
# 'tunnel'  - [integer] local port to open via reverse tunnel, if remote host cannot otherwise reach this client
#
default  = {'saspath'  : '/opt/sasinside/SASHome/SASFoundation/9.4/bin/sas_u8'
            }

ssh      = {'saspath' : '/opt/sasinside/SASHome/SASFoundation/9.4/bin/sas_en',
            'ssh'     : '/usr/bin/ssh',
            'host'    : 'remote.linux.host', 
            'encoding': 'latin1',
            'options' : ["-fullstimer"]
            }


# For IOM (Grid Manager or any IOM) and Local Windows via IOM access method
# These configuration definitions are for connecting over IOM. This is designed to be used to connect to any Workspace server, including SAS Grid, via Grid Manager
# and also to connect to a local Windows SAS session. The client side (python and java) for this access method can be either Linux or Windows.
# The STDIO access method above is only for Linux. PC SAS requires this IOM interface.
#
# The absence of the iomhost option triggers local Windows SAS mode. In this case none of 'iomhost', 'iomport', 'omruser', 'omrpw' are needed.
# a local SAS session is started up and connected to.
#
# The encoding is figured out by saspy. You don't need to specify it, unless you just want to get rid of the message about which encoding was determined.

# NONE OF THE PATHS IN THESE EAMPLES ARE RIGHT FOR YOUT INSTALL. YOU HAVE TO CHANGE THE PATHS TO BE CORRECT FOR YOUR INSTALLATION 
#
# valid keys are:
# 'java'      - [REQUIRED] the path to the java executable to use
# 'iomhost'   - [REQUIRED for remote IOM case, Don't specify to use a local Windows Session] the resolvable host name, or ip to the IOM server to connect to
# 'iomport'   - [REQUIRED for remote IOM case, Don't specify to use a local Windows Session] the port IOM is listening on
# 'authkey'   - identifier for user/password credentials to read from .authinfo file. Eliminates prompting for credentials.
# 'omruser'   - not suggested        [REQUIRED for remote IOM case but PROMPTED for at runtime] Don't specify to use a local Windows Session
# 'omrpw'     - really not suggested [REQUIRED for remote IOM case but PROMPTED for at runtime] Don't specify to use a local Windows Session
# 'encoding'  - This is the python encoding value that matches the SAS session encoding of the IOM server you are connecting to
# 'appserver' - name of physical workspace server (when more than one app server defined in OMR) i.e.: 'SASApp - Workspace Server'
# 'sspi'      - boolean. use IWA instead of user/pw to connect to the IOM workspace server


iomlinux = {'java'      : '/usr/bin/java',
            'iomhost'   : 'linux.iom.host',
            'iomport'   : 8591,
            }           

iomwin   = {'java'      : '/usr/bin/java',
            'iomhost'   : 'windows.iom.host',
            'iomport'   : 8591,
            }
# TO MAKE THIS WORK YOU NEED TO PROVIDE THE JAVA LOCATION FROM YOUR SAS HOME FOLDER> THE LOCATION OF THE FOLDER MAY VARY SO YOU MIGHT HAVE TO LOOK AROUND
#IOM HOST: THIS IS THE HOST ADDRESS OF THE SAS SERVER YOU WANT TO LOG INTO. THIS CAN BE FOUND BY RIGHT CLICKING THE SERVER IN ENTERPRISE GUIDE.
#IOM PORT:THIS IS THE HOST ADDRESS OF THE SAS SERVER YOU WANT TO LOG INTO. THIS CAN BE FOUND BY RIGHT CLICKING THE SERVER IN ENTERPRISE GUIDE.
# winlocal is to access local windows SAS instance
# winimolinux is to connect to a remote server which is linux and local 
  machine is windows
# winimwin is to connect a windows to a windows server
# Windows client and with IWA to Remote IOM server

winlocal = {'java'      : 'C:\\SAS\\SASHome\\SASPrivateJavaRuntimeEnvironment\\9.4\\jre\\bin\\java',
            'encoding'  : 'windows-1252',
            }

winiomlinux = {'java'   : 'C:\\SAS\\SASHome\\SASPrivateJavaRuntimeEnvironment\\9.4\\jre\\bin\\java',
            'iomhost'   : 'host_address_sasgrid.com.au',
            'iomport'   : 8591,
            }

winiomwin  = {'java'    : 'java',
            'iomhost'   : 'host_address_sasgrid.com.au',
            'iomport'   : 8591,
            }

winiomIWA  = {'java'    : 'C:\\SAS\\SASHome\\SASPrivateJavaRuntimeEnvironment\\9.4\\jre\\bin\\java',
            'iomhost'   : 'host_address_sasgrid.com.au',
            'iomport'   : 8591,
            'sspi'      : True
            }


# For Remote and Local IOM access methods using COM interface
# These configuration definitions are for connecting over IOM using COM. This
# access method is for Windows clients connecting to remote hosts. Local
# SAS instances may also be supported.
#
# This access method does not require a Java dependency.
#
# Valid Keys:
#   iomhost     - Required for remote connections only. The Resolvable SAS
#                 server dns name.
#   iomport     - Required for remote connections only. The SAS workspace
#                 server port. Generally 8591 on standard remote
#                 installations. For local connections, 0 is the default.
#   class_id    - Required for remote connections only. The IOM workspace
#                 server class identifier. Use `PROC IOMOPERATE` to identify
#                 the correct value. This option is ignored on local connections.
#   provider    - [REQUIRED] IOM provider. "sas.iomprovider" is recommended.
#   encoding    - This is the python encoding value that matches the SAS
#                 session encoding of the IOM server.
#   omruser     - SAS user. This option is ignored on local connections.
#   omrpw       - SAS password. This option is ignored on local connections.
#   authkey     - Identifier for credentials to read from .authinfo file.

# windows com to sas IOM this method is not java dependant though require pywin32 package to be installed and imported when calling saspy

iomcom = {
    'iomhost' : 'host_address_sasgrid.com.au',
    'iomport' : 8591,
    'class_id': '043396D4-10F0-11D0-9F41-99A024FF830C',
    'provider': 'sas.iomprovider',
    'encoding': 'utf-8'}


# HTTP access method to connect to the Compute Service
# These need ip addr, other values will be prompted for - python Dict
# valid keys are:
# 'url'     - (Required if ip not specified) The URL to Viya, of the form "http[s]://host.idenifier[:port]". 
#             When this is specified, ip= will not be used, as the host's ip is retrieved from the url. Also, ssl= is 
#             set based upon http or https and port= is also parsed from the url, if provided, else defaulted based 
#             upon the derived ssl= value. So neither ip, port nor ssl are needed when url= is used.
# 'ip'      - (Required if url not specified) The resolvable host name, or IP address to the Viya Compute Service
# 'port'    - port; the code Defaults this to based upon the 'ssl' key; 443 default else 80
# 'ssl'     - whether to use HTTPS or just HTTP protocal. Default is True, using ssl and poort 443
# 'context' - context name defined on the compute service  [PROMTED for at runtime if more than one defined]
# 'authkey' - identifier for user/password credentials to read from .authinfo file. Eliminates prompting for credentials.
# 'options' - SAS options to include (no '-' (dashes), just option names and values)
# 'user'    - not suggested [REQUIRED but PROMTED for at runtime]
# 'pw'      - really not suggested [REQUIRED but PROMTED for at runtime]
# 
#
             
httpsviya = {'ip'      : 'sastpw.rndk8s.openstack.sas.com',
             'context' : 'Data Mining compute context',
             'authkey' : 'viya_user-pw',
             'options' : ["fullstimer", "memsize=1G"]
             }

httpviya = {'ip'      : 'sastpw.rndk8s.openstack.sas.com',
            'ssl'     : False,  # this will use port 80
            'context' : 'Data Mining compute context',
            'authkey' : 'viya_user-pw',
            'options' : ["fullstimer", "memsize=1G"]
            }

You can replace the above sascfg.py file or create a sascfg_personal.py file and save it in the same folder. The package first tries to reference sascfg_personal.py file if not then sascfg.py.


Below are few basic codes which help you get started. Remember SAS magic code "%%SAS" can only be written at the start of a new cell and not in the middle after other code.


Example code to reference a code within SAS GRID

Path of the lib can be known by right-clicking and viewing the properties in enterprise guide.

sas.saslib('my_lib', path="/path_lib/right_click/properties_of_lib")

Example of SAS MAGIC CODE ( note: remember always write this code at the start of a new cell).


%%SAS sas

proc sql;
create table test as
    select mytable.*
        
    from mylib.testform as mytable;
quit; 

 

ADDITIONAL INFORMATION:

If for some reason the above configuration doesn't work I have also shown a old sasfy.py configures files below. In the previous version you had to define various class paths for various jars.


OLD SASCFG.py configuration


#
# Copyright SAS Institute
#
#  Licensed under the Apache License, Version 2.0 (the License);
#  you may not use this file except in compliance with the License.
#  You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
#  Unless required by applicable law or agreed to in writing, software
#  distributed under the License is distributed on an "AS IS" BASIS,
#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#  See the License for the specific language governing permissions and
#  limitations under the License.
#

# THIS IS AN EXAMPLE CONFIG FILE. PLEASE CREATE YOUR OWN sascfg_personal.py FILE USING THE APPROPRIATE TEMPLATES FROM BELOW
# SEE THE CONFIGURATION DOC AT https://sassoftware.github.io/saspy/install.html#configuration


# Configuration Names for SAS - python List
# This is the list of allowed configuration definitions that can be used. The definition are defined below.
# if there is more than one name in the list, and cfgname= is not specified in SASsession(), then the user
# will be prompted to choose which configuration to use.
#
# The various options for the different access methods can be specified on the SASsession() i.e.:
# sas = SASsession(cfgname='default', options='-fullstimer', user='me')
#
# Based upon the lock_down configuration option below, you may or may not be able to override option
# that are defined already. Any necessary option (like user, pw for IOM or HTTP) that are not defined will be 
# prompted for at run time. To dissallow overrides of as OPTION, when you don't have a value, simply
# specify options=''. This way it's specified so it can't be overridden, even though you don't have any
# specific value you want applied.
# 
#SAS_config_names = ['default', 'ssh', 'iomlinux', 'iomwin', 'winlocal', 'winiomlinux', 'winiomwin', 'httpsviya', 'httpviya', 'iomcom']
##
# NEEDS CHANGE OF DEFAULT VALUE. IN THE FIRST INSTANCE YOU WILL ONLY SEE 'default' AND YOU HAVE TO MANUALLY ADD OTHER CONFIG_NAMES WHEN YOU OPEN THIS FILE FIRST

SAS_config_names=['default','winlocal','winiomwin','winiomlinux','winiomIWA','iomcom']

# Configuration options for saspy - python Dict
# valid key are:
# 
# 'lock_down' - True | False. True = Prevent runtime overrides of SAS_Config values below
#
# 'verbose'   - True | False. True = Allow print statements for debug type messages
#
SAS_config_options = {</