Sunday, June 20, 2021

World WiFi Day 2021!

We (human beings) have several weird superpowers, but the ability to communicate over vast distances has always fascinated me the most.

I've had the privilege of meeting some of the most truly capable pioneers in this field - but the reality here is that we're faced with a very unequal world.

Authors like William Gibson and Neal Stephenson have the right of things as well, and while we're not quite living in that dystopian future, technology can become a great equalizer.

So yeah - as telecommunications operators we have the responsibility to bridge this gap!

Learn More

I'm always surprised by how much there still is to learn, even in fields I feel like I already know. Here are a few learning approaches that will help you build out a good foundation for learning Wi-Fi (and more!):

  • Amateur Radio: It's cheesy, sure. I'm amazed at how much my grown-up self can now do with amateur radio - I've been licensed since the '90s (KL0NS) and the community is doing so much more now than ever. When I was very young, this was a good opportunity to learn the principles of radio outdoors.
    • In Anchorage, Alaska we have it pretty good - KL7AA runs its own test sessions, and they sell the study book I used way back when
    • Everywhere else in the US, the test costs $15 and probably takes about 2 days to study for. There's no reason not to try it out and participate. Most of the study material is free, and we even get practice tests
    • ARRL provides good ways to participate
    • If you join a radio club they'll find different ways to exercise your brain, and they're usually pretty fun. In addition, you'll be helping maintain emergency communication networks.
  • Get Certified. Also pretty cheesy. I know how people feel about IT certs, and I would still argue for them in this case. For Wi-Fi, the CWNP organization tends to serve the same role as the Linux Professional Institute - employers don't know about them, but it's really effective in terms of education.

Do More

Let's just cover some volunteer opportunities here - because there's no point in building skills if you don't use them:

  • World Wi-Fi Day
  • ITDRC: These guys are really neat. The IT Disaster Resource Center leverages oldie-but-goodie enterprise telecom/IT equipment to provide disaster relief all over the continental US. Check out their deployment map!
  • Airheads Volunteer Corp. I know this is a vendor plug, but this approach is really cool if you can travel!
  • United Way

On Volunteering

One of the passive effects of these approaches - you'll get better as you go. Employers nearly always constrain your learning path to what they need at the moment, often to their own detriment. They may not know what they'll need you to do next year - COVID showed us that. Volunteering not only gives you an opportunity to help others but also passively improves your skills outside of the usual "corporate playbook".

Sunday, June 6, 2021

XML, JSON, YAML - Python data structures and visualization for infrastructure engineers

At some point, we can't "do it all" with one block of code. 

As developers, we need to store persistent data for a variety of reasons:

  • We want it for later execution (or to compare it to another result)
  • We're sick of storing variables in code. This matters a lot more in compiled languages than interpreted ones
  • We want the results to end up in some form of a deliverable report

Let's cover a computer science concept being used here - semaphores. Edsger Dijkstra (you may remember him from the shortest-path algorithm behind OSPF) adopted this term, from Greek sema (sign) and phero (bearer), to solve Inter-Process Communication (IPC) issues.

To provide a reductionist example, process A and process B need to communicate somehow, but they shouldn't access each other's memory (or, in the '60s, shared memory simply wasn't available). To solve this problem, the developer needs a method of storing variables that is both efficient and consistently interpretable.

Dijkstra's example, in this case, was binary, and required a schema to interpret the same data - but was not specifically limited to single binary blocks. This specific need actually influenced one of the three data types we're comparing here - consequently the oldest.
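As a loose illustration of why binary data needs a shared schema (this is my own sketch, not Dijkstra's actual mechanism), here's what it looks like with Python's struct module. The record layout, field names, and values are all hypothetical - the point is just that both sides have to agree on the format string before the bytes mean anything.

```python
import struct

# Hypothetical layout agreed on by both processes (the "schema"):
# little-endian, unsigned 16-bit sensor id, unsigned 32-bit reading
RECORD_FORMAT = "<HI"


def encode_record(sensor_id, reading):
    """Process A: pack the values into raw bytes."""
    return struct.pack(RECORD_FORMAT, sensor_id, reading)


def decode_record(payload):
    """Process B: recover the values using the same layout."""
    return struct.unpack(RECORD_FORMAT, payload)


payload = encode_record(7, 1024)
print(payload.hex())           # opaque without the format string
print(decode_record(payload))  # meaningful once the schema is applied
```

Without RECORD_FORMAT, the payload is just six bytes. That's the gap that self-describing text formats try to close.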

But which one do I use? TL;DR?

Spoiler alert - anyone working with automation MUST learn all three to be truly effective. My general guidance would be:

  • This is a personal preference, but I would highly recommend YAML for human inputs. It's extremely easy to write, and while I generally prefer JSON, it's much easier to first write a document in YAML and then convert it. If you take user input or just want to get a big JSON document started, I'd do it this way.
    • YAML parsers can also read JSON (YAML 1.2 is a superset of JSON), making this an extremely flexible approach.
  • JSON is good for storing machine inputs/outputs. Because all typing is pretty explicit with JSON, json.dumps(dict, indent=4) is pretty handy for previewing what your code thinks structured data looks like. Technically this is possible with YAML, but conventions on, say, a string literal can be squishy.
    • YAML with name: True could be interpreted as:
      • JSON of "name": true, indicating a Boolean value
      • JSON of "name": "True", indicating a String
    • Sure, this is oversimplified, and YAML can be explicitly typed, but generally, YAML is awesome for its speed and low initial friction. If an engineer knows YAML really well (and writes their own classes for it) going all-YAML here is completely possible - but to me that's just too much work.
    • If you use it in the way I recommend, just learn to interpret JSON and use Python's JSON library natively, and remember json.dumps(dict, indent=4) for outputs. You'll pick it up in about half an hour and just passively get better over time.
  • Use XML if that's what the other code is using. Python's Element and ElementTree constructs are more nuanced than dictionaries, so a package like defusedxml is probably the best way to get started. XML has a lot of security issues (entity expansion and external entity attacks, for example), so using basic XML libraries by themselves is ill-advised. xmltodict is pretty handy if you just want to convert it into another format.
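To make the typing point above a bit more concrete, here's a minimal sketch using only the standard json library. The two dicts are made-up examples of the two ways a parser could read name: True. The JSON output leaves no doubt about which one your code is holding.

```python
import json

# Two possible readings of the same human input
as_boolean = {"name": True}
as_string = {"name": "True"}

# The preview makes the type visible: true (Boolean) vs "True" (String)
print(json.dumps(as_boolean, indent=4))
print(json.dumps(as_string, indent=4))

# The type survives a round trip through JSON text
restored = json.loads(json.dumps(as_boolean))
print(type(restored["name"]).__name__)
```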

Note: JSON and XML both support Schema Validation, an important aspect of semaphores. YAML doesn't have a native function like this, but I have used the Cerberus package for Python to do the same thing here.
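If you just want a feel for what schema validation buys you, here's a rough hand-rolled sketch in plain Python. To be clear, this is a stand-in I wrote for illustration - it is not the Cerberus API or JSON Schema, and the schema, field names, and validate function are all hypothetical. Once YAML or JSON is loaded into a dict, the same idea applies to either.

```python
# Hypothetical schema: field name -> required Python type
SCHEMA = {"hostname": str, "vlan": int, "enabled": bool}


def validate(document, schema):
    """Return a list of problems; an empty list means the document passed."""
    errors = []
    for field, expected in schema.items():
        if field not in document:
            errors.append(f"{field}: missing")
        elif type(document[field]) is not expected:
            # Exact type match, so True is not silently accepted as an int
            actual = type(document[field]).__name__
            errors.append(f"{field}: expected {expected.__name__}, got {actual}")
    for field in document:
        if field not in schema:
            errors.append(f"{field}: unknown field")
    return errors


good = {"hostname": "sw01", "vlan": 10, "enabled": True}
bad = {"hostname": "sw01", "vlan": "10"}

print(validate(good, SCHEMA))
print(validate(bad, SCHEMA))
```

A real validator does a lot more (nested structures, ranges, defaults), so I'd treat this as a way to understand the concept rather than something to ship.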

YAML

YAML was initially released in 2001 and has gained recent popularity with projects like Ansible. YAML 1.2 was released in 2009 and is publicly maintained by the community, so it won't have industry bias (but also won't change as quickly). YAML writes a lot like Python, consuming a ton of whitespace and being particular about tags. Users either love or hate it - I typically only use it for human inputs and objects that are frequently peer-reviewed.

NOTE: one big upside to YAML with people processes is comment support. YAML supports comments, but JSON does not.

YAML is pretty easy to start using in Python. I'm a big fan of the ruamel.yaml library, which adds on some interesting capabilities when parsing human inputs. I've found a nifty way to parse using try/except blocks - making a parser that is supremely agnostic, ingesting JSON or YAML, as a string or a file!

---
message:
  items:
    item:
      "@tag": Blue
      "#text": Hello, World!

#!/usr/bin/python3

import json

from ruamel.yaml import YAML

# Safe loader: builds plain Python types only
yaml_input = YAML(typ='safe')
yaml_dict = {}

# The input may be a file path or a literal YAML/JSON string
source = 'example.yml'

# Try the input as a file first, then fall back to parsing it as a string
try:
    with open(source, 'r') as file:
        yaml_dict = yaml_input.load(file)
except FileNotFoundError:
    print('Not found as file, trying as a string...')
    yaml_dict = yaml_input.load(source)

print(json.dumps(yaml_dict, indent=4))

JSON

JSON was first standardized in 2006 (RFC 4627) and is currently maintained by the IETF as RFC 8259. Python 3 prints dicts in a form that looks a lot like JSON as well - making things pretty intuitive. In my experience, writing JSON by hand is pretty annoying because it's picky.

{
    "message": {
        "items": {
            "item": {
                "@tag": "Blue",
                "#text": "Hello, World!"
            }
        }
    }
}

#!/usr/bin/python3

import json

# Load the document and pretty-print it to confirm how it was parsed
with open('example.json', 'r') as file:
    print(json.dumps(json.load(file), indent=4))

Typically, I'll just use json.dumps(dict, indent=4) on a live dict when I'm done with it - dumping it to a file. JSON is a well-defined standard and software support for it is excellent.
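Here's a minimal sketch of that dump-to-file habit. The results dict and the output path are placeholders (I'm using a temp directory so the example cleans up easily), and json.dump writes straight to the file handle instead of building the string first.

```python
import json
import os
import tempfile

# Placeholder for whatever dict your code has built up
results = {"message": {"items": {"item": {"@tag": "Blue", "#text": "Hello, World!"}}}}

# Hypothetical output location; a temp directory keeps this self-contained
output_path = os.path.join(tempfile.mkdtemp(), "results.json")

# Write the live dict out as indented JSON
with open(output_path, "w") as file:
    json.dump(results, file, indent=4)

# Read it back later (or from another program) and compare
with open(output_path, "r") as file:
    reloaded = json.load(file)

print(reloaded == results)
```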

Due to its IETF bias, JSON's future seems to focus on the streaming/logging required for infrastructure management. JSON-serialized Syslog is a neat application here, as you can write it to a file as a single line, but also explode it for readability, infuriating grep users everywhere.
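A quick sketch of the single-line vs. exploded idea, using only the standard library. The event and its field names are invented for illustration and don't follow any particular syslog standard.

```python
import json

# Hypothetical log event; field names are illustrative only
event = {"host": "sw01", "severity": "warning", "msg": "link flap", "port": 12}

# Compact form: one event per line, friendly to log files and streams
line = json.dumps(event, separators=(",", ":"))
print(line)

# Exploded form: the same event, re-indented for a human reader
exploded = json.dumps(json.loads(line), indent=4)
print(exploded)
```

Both forms decode to the same dict, so the choice is purely about who (or what) is reading it.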

XML

XML is the oldest data language typically used for automation/data ingestion, and it really shows. XML was originally established by the W3C in 1998 and is used in many document formats, such as Microsoft Office files.

XML's document and W3C bias read very strongly. Older Java-oriented platforms like Jenkins CI heavily leverage XML for semaphores, document reporting, and configuration management. Strict validation (documents MUST be well-formed) synergizes well with compiled languages. XML also heavily uses HTML-style escaping and tagging approaches, making it familiar to some web developers.

XML has plenty of downsides. Crashing on invalid input is generally considered excessive or "Steve-Ballmer"-esque, which makes the language favorable for mission-critical applications where misinterpreted data MUST NOT be processed, and miserable everywhere else. For human inputs, it's pretty wordy, which impacts readability quite a bit.
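To see that strictness in action, here's a small sketch. I'm using the standard library's xml.etree.ElementTree to keep it self-contained, which is fine for these hard-coded strings; for anything untrusted I'd still reach for defusedxml. The sample documents and the try_parse helper are made up for the example.

```python
import xml.etree.ElementTree as ET

well_formed = "<message><item>Hello, World!</item></message>"
malformed = "<message><item>Hello, World!</message>"  # <item> is never closed


def try_parse(text):
    """Return the root tag, or a rejection message if the XML is not well-formed."""
    try:
        return ET.fromstring(text).tag
    except ET.ParseError as error:
        return f"rejected: {error}"


print(try_parse(well_formed))
print(try_parse(malformed))
```

There's no partial result for the second document - the parser refuses the whole thing, which is exactly the behavior you want (or hate) depending on the application.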

Schemas

XML has two tiers of schema - Document Type Definition (DTD) and XML Schema. DTDs are very similar to HTML DTDs and provide a method of validating that the language is correctly used. XML Schema Definitions (XSD) provide typing and structures for validation and are the more commonly used tool.

Python Example

XML leverages the Element and ElementTree constructs in Python instead of dicts. This is due to XML being capable of so much more, but it's still pretty easy to use:

XML Document:

<?xml version="1.0" encoding="ISO-8859-1" ?>
<message>
    <items>
        <item tag="Blue">Hello, World!</item>
    </items>
</message>

#!/usr/bin/python3

import json

from defusedxml.ElementTree import parse
import xmltodict

# Parse with defusedxml, then walk the tree:
# document[0] is <items>, document[0][0] is <item>
document = parse('example.xml').getroot()
print(document[0][0].text + " " + json.dumps(document[0][0].attrib))

# Alternatively, convert the whole document into nested dicts with xmltodict
with open('example.xml', 'r') as file:
    print(json.dumps(xmltodict.parse(file.read()), indent=4))

After using both methods, I generally prefer using xmltodict for data processing - it lets me use a common language (Python lists and dicts) to process all data regardless of source, allowing me to focus more on the payload. We're really fortunate to have this fantastic F/OSS community enabling that!
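If you're wondering where the "@tag" and "#text" keys in the YAML and JSON examples come from, here's a simplified sketch of the kind of conversion xmltodict performs. This is my own rough approximation using the standard library, not xmltodict's actual implementation (for one, it returns an empty string for empty elements), and element_to_dict is a hypothetical helper. Same caveat as before: standard ElementTree is only reasonable here because the input is a hard-coded string.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(element):
    """Convert an Element into nested dicts, loosely following xmltodict's conventions."""
    # Attributes become keys prefixed with "@"
    node = {"@" + key: value for key, value in element.attrib.items()}
    for child in element:
        value = element_to_dict(child)
        if child.tag in node:
            # A repeated tag turns into a list of values
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (element.text or "").strip()
    if not node:
        # A leaf with no attributes collapses to its text
        return text
    if text:
        node["#text"] = text
    return node


xml_text = """
<message>
    <items>
        <item tag="Blue">Hello, World!</item>
    </items>
</message>
"""

root = ET.fromstring(xml_text.strip())
converted = {root.tag: element_to_dict(root)}
print(json.dumps(converted, indent=4))
```

Once the data is in this shape, everything downstream is plain dict and list handling, regardless of whether it started life as XML, JSON, or YAML.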

Get an A on ssllabs.com with VMware Avi / NSX ALB (and keep it that way with SemVer!)

Cryptographic security is an important aspect of hosting any business-critical service. When hosting a public service secured by TLS, it is ...