Tag: python

Extracting Waste Stream Collection Dates for the Netherlands

Yeah … mostly saving this for the regex search with a start and end flag that spans newlines because I don’t really need to know the date they collect each waste stream in the Netherlands. Although it’s cool that they’ve got five different waste streams to collect.

import requests
import re

strBaseURL = 'https://afvalkalender.waalre.nl/adres/<some component of your address in the Netherlands>' 
iTimeout = 600 
strHeader = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36'}

# Start and end flags for waste stream collection schedule content
START = '<ul id="ophaaldata" class="line">' 
END = '</ul>' 

page = requests.get(strBaseURL, timeout=iTimeout, headers=strHeader)
strContent = page.content
strContent = strContent.decode("utf-8")

result = re.search('{}(.*?){}'.format(START, END), strContent, re.DOTALL)
strCollectionDateSource = result.group(1)

resultWasteStreamData = re.findall('<li>(.*?)</li>', strCollectionDateSource, re.DOTALL)
for strWasteStreamRecord in resultWasteStreamData:
    listWasteStreamRecord = strWasteStreamRecord.split("\n")
    strDate = listWasteStreamRecord[3]
    strWasteType = listWasteStreamRecord[4]
    print("On {}, they collect {}".format(strDate.strip().replace('<i class="date">','').replace('</i>',''), strWasteType.strip().replace('<i>','').replace('</i>','')))

Python f-strings

I’m accustomed to using variables directly in strings — most of my scripting experience was with Bash, Perl, and PHP. It’s something I have to think about when I would write Java or C++ code, and it’s one of the few things that struck me as a step backward in Python. This is basically the advantage that kwargs have over args, except changing a string common where changing a function definition is pretty rare.

I’ve taken to using the arg identifier numbers in the print statement.

>>> first=”Lisa”
>>> last=”Rushworth”
>>> print(“Hi {0}, you are {0} {1}.”.format(first, last))
Hi Lisa, you are Lisa Rushworth.

This allows me to append a new variable to the format argument list and use it anywhere in the string
>>> salutation=”Hello”
>>> print(“{2} {0}, you are {0} {1}.”.format(first, last, salutation))
Hello Lisa, you are Lisa Rushworth.

Python 3 introduced f-strings, which go beyond the functionality I was missing. You *can* use variable names within the string:

>>> print(f”{salutation} {first}, you are {first} {last}”)
Hello Lisa, you are Lisa Rushworth

You can have multi-line strings — notice that newlines aren’t being magically added. If you *want* newlines, add \n!

>>> strString = (
… f”{salutation} {first}, ”
… f”You are {first} {last} ”
… f”And I have more ”
… f”data down here”
… )
>>> print(strString)
Hello Lisa, You are Lisa Rushworth And I have more data down here

But you can also perform operations:
>>> print(f”{2 ** 3}”)
8

And use the f-string in classes
>>> class MyThing:
… def __init__(self, thing1, thing2, thing3):
… self.thing1 = thing1
… self.thing2 = thing2
… self.age = thing3

… def __str__(self):
… return f”{self.thing1} has 2 {self.thing2} and 3 {self.age}.”

>>>
>>> thingSample = MyThing(“Something1”, “Something2”, “Something3″)
>>> print(f”{thingSample}”)
Something1 has 2 Something2 and 3 Something3.

What if you need to have a literal curly brace?
>>> print(f”Hi {first}, this output needs a {{curly brace}}”)
Hi Lisa, this output needs a {curly brace}

Two Approaches to Using PIP Through an Integrated Authenticated Proxy

The proxy at work uses integrated authentication. While BASIC auth prompts happily let you use a proxy of http://uid:pass@proxy:port, ours does not. There are two ways I’ve managed to use pip to install packages.

Proxied Proxy

The easiest approach is to use something that can handle the authenticated proxy, like Fiddler, as an intermediary. I do the same thing with perl’s PPM, docker pull … basically anywhere I’ve got to use a proxy that wants to pass through a proxy username.

Select Tools => Options to open the configuration dialog. Make sure you are handling SSL traffic — if not, check the box to “Capture HTTPS CONNECTs, “Decrypt HTTPS traffic”, and “Ignore server certificate errors” (it’s only unsafe if you don’t understand what you’re doing … don’t log into your bank account bouncing traffic through this config!)

On the “Connections” tab, check the port on which Fiddler is listening. If you cannot install Fiddler on the same box where you want to use pip, you’ll need to check off “Allow remote computers to connect” (and you won’t use localhost as the proxy hostname). Click OK, start capturing traffic (F12), and we’re ready to go.

Use the PIP command line to install the package but proxy the request through your Fiddler instance. In this example, Fiddler is installed on the local box and uses port 8888.

pip –trusted-host pypi.org –trusted-host files.pythonhosted.org –proxy http://localhost:8888 install SomePackage

This is nice because pip will automatically resolve dependencies. Not great if you’re not allowed to install your own software and cannot get Fiddler installed.

Dependency Nighmare

Back in the early days of Linux (think waaaay before package managers working against online repositories), we called this “dependency hell” — navigating dependency chains to get all of the required “stuff” on the box so you can install the thing you actually wanted.

Make a folder for all these wheels we’re going to end up downloading so it’s easy to clean up once we’re done. Search PyPi for the package you want. On the package page, select ‘Download Files’ and then download the whl

Use “pip install something.whl” to attempt installing it. If you’re lucky, you’ve got all the dependencies and the package will install. If you don’t have all of the dependencies, you’ll get an error telling you the first missing one. Go back to the pypi website & get that. Use “pip install somethingelse.whl” to install it and maybe get a dependency error here too. Once you get a dependency installed, try the package you want again. Got another error? Start downloading and installing again. Eventually, though, you’ll get all of the dependencies satisfied and be able to install the package you actually want.

Adding Python to Fedora Alternatives

Gimp installed Python 2.7). Which, of course, took over my system so nothing was using Python 3 anymore. We’ve used ‘alternatives’ to manage the Java installation, and I thought that might be a good solution in case I ever need to use Python 2

Add both Python versions to alternatives:

[lisa@fedora ~]# sudo alternatives –install /usr/bin/python python /usr/bin/python3.7 1
[lisa@fedora ~]# sudo alternatives –install /usr/bin/python python /usr/bin/python2.7 2

Select which one you want to use:

[lisa@fedora ~]# sudo alternatives –config python

There are 2 programs which provide ‘python’.

Selection Command
———————————————–
1 /usr/bin/python3.7
+ 2 /usr/bin/python2.7

Enter to keep the current selection[+], or type selection number: 1

And, of course, repeat the process for PIP:

[lisa@fedora ~]# sudo alternatives –install /usr/bin/pip pip /usr/bin/pip2.7 2
[lisa@fedora ~]# sudo alternatives –install /usr/bin/pip pip /usr/bin/pip3.7 1

[lisa@fedora ~]# sudo alternatives –config pip

List locally installed Python modules

I’ve been helping someone else get an Azure bot running on their system … which involves a lot of “what do I have that you don’t” … for which listing locally installed python modules is incredibly helpful.

python -c “import pkg_resources; print([(d.project_name, d.version) for d in pkg_resources.working_set])”

 

[lisa@cent6 ljr]# python -c “import pkg_resources; print([(d.project_name, d.version) for d in pkg_resources.working_set])”
[(‘scipy’, ‘1.2.0’), (‘scikit-learn’, ‘0.20.2’), (‘PyYAML’, ‘3.13’), (‘PyMySQL’, ‘0.9.3’), (‘pycares’, ‘2.4.0’), (‘numpy’, ‘1.16.0’), (‘multidict’, ‘4.5.2’), (‘Cython’, ‘0.29.4’), (‘coverage’, ‘4.5.2’), (‘aiohttp’, ‘3.0.9’), (‘yarl’, ‘1.3.0’), (‘wrapt’, ‘1.11.1’), (‘vcrpy’, ‘2.0.1’), (‘typing’, ‘3.6.6’), (‘sklearn’, ‘0.0’), (‘singledispatch’, ‘3.4.0.3’), (‘sharepy’, ‘1.3.0’), (‘requests-toolbelt’, ‘0.9.1’), (‘requests-oauthlib’, ‘1.2.0’), (‘pytest’, ‘4.1.1’), (‘pytest-cov’, ‘2.6.1’), (‘pytest-asyncio’, ‘0.10.0’), (‘PyJWT’, ‘1.7.1’), (‘py’, ‘1.7.0’), (‘pluggy’, ‘0.8.1’), (‘oauthlib’, ‘3.0.1’), (‘nltk’, ‘3.4’), (‘msrest’, ‘0.4.29’), (‘more-itertools’, ‘5.0.0’), (‘isodate’, ‘0.6.0’), (‘ConfigArgParse’, ‘0.14.0’), (‘certifi’, ‘2018.11.29’), (‘botframework-connector’, ‘4.0.0a6’), (‘botbuilder-schema’, ‘4.0.0a6’), (‘botbuilder-azure’, ‘4.0.0a6’), (‘azure-devtools’, ‘1.1.1’), (‘azure-cosmos’, ‘3.0.0’), (‘attrs’, ‘18.2.0’), (‘atomicwrites’, ‘1.2.1’), (‘async-timeout’, ‘2.0.1’), (‘aiodns’, ‘1.2.0’), (‘botbuilder-core’, ‘4.0.0a6’), (‘systemd-python’, ‘234’), (‘smartcols’, ‘0.3.0’), (‘setools’, ‘4.1.1’), (‘rpm’, ‘4.14.2.1’), (‘gpg’, ‘1.11.1’), (‘cryptography’, ‘2.3’), (‘cffi’, ‘1.11.5’), (‘urllib3’, ‘1.24.1’), (‘SSSDConfig’, ‘2.0.0’), (‘slip’, ‘0.6.4’), (‘slip.dbus’, ‘0.6.4’), (‘six’, ‘1.11.0’), (‘setuptools’, ‘40.4.3’), (‘sepolicy’, ‘1.1’), (‘requests’, ‘2.20.0’), (‘PySocks’, ‘1.6.8’), (‘pyparsing’, ‘2.2.0’), (‘pyOpenSSL’, ‘18.0.0’), (‘pykickstart’, ‘3.16’), (‘PyGObject’, ‘3.30.4’), (‘pycparser’, ‘2.14’), (‘ply’, ‘3.9’), (‘pip’, ‘18.1’), (‘ordered-set’, ‘2.0.2’), (‘isc’, ‘2.0’), (‘IPy’, ‘0.81’), (‘iotop’, ‘0.6’), (‘iniparse’, ‘0.4’), (‘idna’, ‘2.7’), (‘distro’, ‘1.3.0’), (‘decorator’, ‘4.3.0’), (‘chardet’, ‘3.0.4’), (‘asn1crypto’, ‘0.24.0’)]

Installing pycrypto On WIN10 with Visual Studio 2017

Yes, I know it’s old and no longer maintained … but it’s also what my function already uses, and I don’t want to take a few hours to get moved over to pycryptodome. You can install pycrypto on Windows 10 running Visual Studio 2017. But there’s a trick. Find the location of stdint.h in your VS installation.

Open x86_x64 Cross Tools Command Prompt For VS 2017

Run:

set CL=-FI"C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\VC\Tools\MSVC\14.14.26428\include\stdint.h"

where the bit in quotes is the path to your stdint.h (VS build version may differ, which will change the path).

Now, from within the Cross Tools Command Prompt, run:

pip install pycrypto

Voila — it installs:

Did you know … you can post data to Teams channels from your own code?

This post contains more niche “for developers” information than most of my Teams info series 😊

You’ve seen third-party services with connectors in Teams – both services with connectors published to Teams and services that can use the “Incoming Webhook” connector to post data. Youcan use the “Incoming Webhook” connector within your code too. If you want to allow others to post information into their Teams spaces, you’ll need a configuration option that allows end-users to supply the webhook URL. If you want to post information into your Teams space, then you can include it directly in your code. If you are doing the former, provide something like the “Creating a Teams Webhook URL” instructions below to your end users. In this example, I am doing the later. The example script is Python code that gathers statistics and posts a summary to my Teams space.

Creating a Teams Webhook URL:

Create a URL for your incoming webhook – from the not-quite-a hamburger menu next to the channel into which you want to publish your data, select “Connectors”, find “Incoming Webhook”, and click “Add” (if you have already configured a webhook in your Teams space, you will see “Configure”instead, and will not have to ‘install’ the webhook)

Read the privacy policy and terms of use, and if they are acceptable click “Install”

Provide a name for your webhook – this name will be displayed on all posts made via this webhook. Scroll down.

You can customize the logo associated with your webhook posts – nice if your application has a well-known logo. Click create to generate the webhook URL.

Copy the URL somewhere, then click ‘Done’.

Using Teams Webhooks Within Code:

OK, now you’ve got something like this in a text document (no, I didn’t post a real webhook to the Internet – the long pseudo-random alphanumeric string is hex, and there aren’t a whole lot of m’s and q’s in hex!):

https://outlook.office.com/webhook/bge9922d-44fa-4c60-bp59-e935554ff4cd@2567m4c1-b0ep-40f5-aee3-58d7c5f3e2b2/IncomingWebhook/644915ga291e4ee499f2479a32qde691/9e0c4d6c-65d9-4f94-b0d5-4fbbb0238358

What do you do with it? You make a POST call to that URL and supply JSON-formatted card content in the data. Microsoft provides a complete card reference, but you’ll need to use the O365Connector Card with the incoming webhook connector.

The card requires “summary” or “text” be included – you’ll get a bad data HTTP response if you fail to set one of these values. Card text can be formatted in markdown or HTML – if you want to use HTML, you need to set markdown to false.

You’ll need a function that POSTs data to a URL:

################################################################################
# This function POSTs to a URL
# Requirements: requests
# Input: strURL -- URL to which data is posted
#        strBody -- content to be sent as data
#        strContentType -- Content-Type definition
# Output: BOOL -- TRUE on 200 HTTP code, FALSE on other HTTP response
################################################################################
def postDataToURL(strURL, strBody, strContentType):
    if strURL is None:
        print("POST failed -- no URL provided")
        return False
    print("Sending POST request to strURL=%s" % strURL)
    print("Body: %s" % strBody)
    try:
        dictHeaders = {'Content-Type': strContentType}
        res = requests.post(strURL, headers=dictHeaders,data=strBody)
        print(res.text)
        if 200 <= res.status_code < 300:
            print("Receiver responded with HTTP status=%d" % res.status_code)
            return True
        else:
            print("POST failed -- receiver responded with HTTP status=%d" % res.status_code)
            return False
    except ValueError as e:
        print("POST failed -- Invalid URL: %s" % e)
    return False

And you’ll need something to create a card and call the HTTP POST function. Here, I define a function that takes statistics on Windstream’s Microsoft Teams usage, formats a card with the values, and POSTs it to my webhook URL.

################################################################################
# This function posts usage stats to Teams via webhook
# Requirements: json
# Input: strURL -- webhook url
#        iPrivateMessages, iTeamMessages, iCalls, iMeetings -- integer usage stats
#        strReportDate - datetime date for stats
# Output: BOOL -- TRUE on 200 HTTP code, FALSE on other HTTP response
################################################################################
def postStatsToTeams(strURL,iPrivateMessages,iTeamMessages,iCalls,iMeetings,strReportDate):
    try:
        strCardContent = '{"title": "Teams Usage Statistics","sections": [{"activityTitle": "Usage report for ' + yesterday.strftime('%Y-%m-%d') + '"}, {"title": "Details","facts": [{"name": "Private messages","value": "' + str(iPrivateMessages) + '"}, {"name": "Team messages","value": "' + str(iTeamMessages) + '"}, {"name": "Calls ","value": "' + str(iCalls) + '"}, {"name": "Meetings","value": "' + str(iMeetings) + '"}]}],"summary": "Teams Usage Statistics","potentialAction": [{"name": "View web report","target": ["https://csgdirsvcs.windstream.com:1977/o365Stats/msTeams.php"],"@context": "http://schema.org","@type": "ViewAction"}]}'
        jsonPostData = json.loads(strCardContent)

        if postDataToURL(strURL, json.dumps(jsonPostData),'application/json'):
            print("POST successful")
            return True
        else:
            print("POST failed")
            return False
    except Exception as e:
        print("ERROR Unexpected error: %s" % e)
    return False

Run the program, and you’ll see information in Teams:

A personal recommendation based on my experience with third-party code that uses generic incoming webhooks — have a mechanism to see more than a generic error when the POST fails. It takes a lot of effort to pull apart what is actually being sent, turn it into a curl command to reproduce the event, and read the actual error.Providing a debug facility that includes both the POST body and actual response from the HTTP call saves you a lot of time should your posts fail.

LDAP Auth With Tokens

I’ve encountered a few web applications (including more than a handful of out-of-the-box, “I paid money for that”, applications) that perform LDAP authentication/authorization every single time the user changes pages. Or reloads the page. Or, seemingly, looks at the site. OK, not the later, but still. When the load balancer probe hits the service every second and your application’s connection count is an order of magnitude over the probe’s count … that’s not a good sign!

On the handful of sites I’ve developed at work, I have used cookies to “save” the authentication and authorization info. It works, but only if the user is accepting cookies. Unfortunately, the IT types who typically use my sites tend to have privacy concerns. And the technical knowledge to maintain their privacy. Which … I get, I block a lot of cookies too. So I’ve begun moving to a token-based scheme. Microsoft’s magic cloudy AD via Microsoft Graph is one approach. But that has external dependencies — lose Internet connectivity, and your app becomes unusable. I needed another option.

There are projects on GitHub to authenticate a user via LDAP and obtain a token to “save” that access has been granted. Clone the project, make an app.py that connects to your LDAP directory, and you’re ready.

from flask_ldap_auth import login_required, token
from flask import Flask
import sys

app = Flask(__name__)
app.config['SECRET_KEY'] = 'somethingsecret'
app.config['LDAP_AUTH_SERVER'] = 'ldap://ldap.forumsys.com'
app.config['LDAP_TOP_DN'] = 'dc=example,dc=com'
app.register_blueprint(token, url_prefix='/auth')

@app.route('/')
@login_required
def hello():
return 'Hello, world'

if __name__ == '__main__':
app.run()

The authentication process is two step — first obtain a token from the URL http://127.0.0.1:5000/auth/request-token. Assuming valid credentials are supplied, the URL returns JSON containing the token. Depending on how you are using the token, you may need to base64 encode it (the httpie example on the GitHub page handles this for you, but the example below includes the explicit encoding step).

You then use the token when accessing subsequent pages, for instance http://127.0.0.1:5000/

import requests
import base64

API_ENDPOINT = "http://127.0.0.1:5000/auth/request-token"
SITE_URL = "http://127.0.0.1:5000/"

tupleAuthValues = ("userIDToTest", "P@s5W0Rd2T35t")

tokenResponse = requests.post(url = API_ENDPOINT, auth=tupleAuthValues)

if(tokenResponse.status_code is 200):
jsonResponse = tokenResponse.json()
strToken = jsonResponse['token']
print("The token is %s" % strToken)

strB64Token = base64.b64encode(strToken)
print("The base64 encoded token is %s" % strB64Token)

strHeaders = {'Authorization': 'Basic {}'.format(strB64Token)}

responseSiteAccess = requests.get(SITE_URL, headers=strHeaders)
print(responseSiteAccess.content)
else:
print("Error requesting token: %s" % tokenResponse.status_code)

Run and you get a token which provides access to the base URL.

[lisa@linux02 flask-ldap]# python authtest.py
The token is eyJhbGciOiJIUzI1NiIsImV4cCI6MTUzODE0NzU4NiwiaWF0IjoxNTM4MTQzOTg2fQ.eyJ1c2VybmFtZSI6ImdhdXNzIn0.FCJrECBlG1B6HQJKwt89XL3QrbLVjsGyc-NPbbxsS_U:
The base64 encoded token is ZXlKaGJHY2lPaUpJVXpJMU5pSXNJbVY0Y0NJNk1UVXpPREUwTnpVNE5pd2lhV0YwSWpveE5UTTRNVFF6T1RnMmZRLmV5SjFjMlZ5Ym1GdFpTSTZJbWRoZFhOekluMC5GQ0pyRUNCbEcxQjZIUUpLd3Q4OVhMM1FyYkxWanNHeWMtTlBiYnhzU19VOg==
Hello, world

A cool discovery I made during my testing — a test LDAP server that is publicly accessible. I’ve got dev servers at work, I’ve got an OpenLDAP instance on Docker … but none of that helps anyone else who wants to play around with LDAP auth. So if you don’t want to bother populating directory data within your own test OpenLDAP … some nice folks provide a quick LDAP auth source.