raku: Class Standards

for a while, it has been bugging me that teams, companies, institutions and tool vendors keep on re-inventing the same wheels

I have lost count of the ways that Contact information has been (re)implemented – by Apple, by Google, by pretty much every eCommerce site I visit (WooCommerce, Shopify, Stripe) and every CRM app (HubSpot, Salesforce) and so on. grrrr…

finally, there is a solution: Raku

UK BS7666 Part2 example – something to aspire to one day?

now, honestly, I am surprised and disappointed that Python is not already there (feel free to comment below if you think I have missed the news!) – perhaps the uncertainty around the various Python module managers had a chilling effect in the early days – or perhaps module makers feel that modules which do not have substantial algorithmic content or infrastructure integration are not hard core enough

the Requirement

so, big picture, what do I think we need in the world that we don’t already have?

  • a practical set of standard classes for common information types
  • a place to collaborate, iterate and improve the set
  • tooling that allows us to refactor and evolve code
  • built in text parsing that provides real value
  • free open source software licence model
  • version control
  • tests

and that’s when it dawned on me that raku provides a unique opportunity to do this

the Basics

in this context, the Python class model is very user friendly

class Contact:
    def __init__(self, name, email, phone):
        self.name = name
        self.email = email
        self.phone = phone

    def display(self):
        print(f"Name: {self.name}")
        print(f"Email: {self.email}")
        print(f"Phone: {self.phone}")

# Create a contact
john_doe = Contact(name="John Doe", email="john@example.com", phone="555-1234")

# Display the contact
print("Contact information:")
john_doe.display()

raku has similar user friendliness, albeit with a few nuances such as the $-sigil for vars and {} for blocks instead of just indentation

class Contact {
    has $.name;
    has $.email;
    has $.phone;

    method display {
        say "Name: $.name";
        say "Email: $.email";
        say "Phone: $.phone";
    }
}

# Create a contact
my $john_doe = Contact.new(name=>"John Doe", email=>"john@example.com", phone=>"555-1234");

# Display the contact information
say "Contact information:";
$john_doe.display;

either way, and these are very trivial examples, using a programmatic class model to define information brings some important benefits:

  • classes create a framework for aggregation
  • attributes (& accessors) label information items
  • methods (& roles) can provide encapsulated import / export logic

the Exchange

for Python, add this:

import json

...

    def to_json(self):
        contact_json = {
            "name": self.name,
            "email": self.email,
            "phone": self.phone
        }
        # compact separators give the single-line output shown below
        return json.dumps(contact_json, separators=(",", ":"))

...

# Convert the contact information to JSON and print it
print("Contact information in JSON:")
json_contact = john_doe.to_json()
print(json_contact)

for raku, go:

use JSON::Class;

...

class Contact is json { ... }

...

# Convert the contact information to JSON and print it
say "Contact information in JSON:";
say $john_doe.to-json;

these both produce:

Contact information in JSON:
{"name":"John Doe","email":"john@example.com","phone":"555-1234"}

it’s easy to serialize our data in a language / platform / application independent way so that it can be passed around, embedded in urls and saved in compatible databases

classes can be inherited and specialized by application-specific extensions if required
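to make the inheritance point concrete, here is a minimal Python sketch. The CustomerContact class and its customer_id field are hypothetical, invented purely for illustration, and the to_dict helper is an assumption that is not part of the earlier example. The subclass adds one field and still serializes through the base class logic.

```python
import json


class Contact:
    """Base contact, mirroring the earlier example."""

    def __init__(self, name, email, phone):
        self.name = name
        self.email = email
        self.phone = phone

    def to_dict(self):
        # Hypothetical helper: gives subclasses one place to add fields
        return {"name": self.name, "email": self.email, "phone": self.phone}

    def to_json(self):
        # Compact separators produce single-line JSON without extra spaces
        return json.dumps(self.to_dict(), separators=(",", ":"))


class CustomerContact(Contact):
    """Hypothetical application-specific extension adding one field."""

    def __init__(self, name, email, phone, customer_id):
        super().__init__(name, email, phone)
        self.customer_id = customer_id

    def to_dict(self):
        # Reuse the standard fields, then append the application-specific one
        data = super().to_dict()
        data["customer_id"] = self.customer_id
        return data


jane = CustomerContact("Jane Roe", "jane@example.com", "555-9876", customer_id="C-0042")
print(jane.to_json())
```

any code that only knows about the base Contact can still read the standard fields from the extended object.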

— intermission —

hopefully, so far, I have explained the benefits of using a dynamic, user friendly, Object Oriented language such as Python or Raku (there are others, too) as the basis for definition and maintenance of classes as evolving data standards for commonly (re-)used information

we all need to reach for (eg.) contact information and schemas from time to time, but let’s face it, for most coders the specific implementation is not important and carries no unique business value. so it is far better to have a FOSS methodology to manage, improve and standardize our formats across our respective employers and institutions

that way we can easily:

  • get up and running in minutes, with a set of common libraries
  • interoperate across languages, layers and platforms
  • provide import / export methods via multiple formats e.g. json (via roles)
  • make PRs for community adoption to add our own improvements

why hasn’t this happened already? the minimum requirement is a user-friendly OO class system, plus a stable package manager … plus the intent to employ the module ecosystem for class-based data standardization as well as access to eg. system libraries

why Raku? it provides these prerequisites in a relatively new, user-friendly and powerful language & ecosystem that is a great fit for Class Standards… so the door is open

why raku Modules?

the big idea #1 is to use the built in raku module features and the raku module ecosystem (zef) to apply community-wide version control. here you can see how easy it is to specify release, author and api versions at publication and at use

unit module Contact:ver<4.2.3>:auth<zef:jane>:api<1>;

in your code, you can consume these information standards with tight control over versions (the + denotes ver greater than or equal to 1.0). this use pins the auth and is agnostic to api

use Contact:auth<zef:jane>:ver<v1.0+>;
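for readers coming from Python, the meaning of the trailing + can be sketched as a plain tuple comparison. This is only an illustration of the matching rule described above, not how raku or zef resolves versions internally; parse_version and satisfies are hypothetical helpers, and the sketch assumes purely numeric dotted versions.

```python
def parse_version(text):
    """Turn '4.2.3' into (4, 2, 3) so versions compare part by part."""
    return tuple(int(part) for part in text.split("."))


def satisfies(installed, spec):
    """Return True if `installed` matches `spec`; a trailing '+' means 'or newer'."""
    if spec.endswith("+"):
        return parse_version(installed) >= parse_version(spec[:-1])
    # Without '+', this simplified sketch demands an exact match
    return parse_version(installed) == parse_version(spec)


print(satisfies("4.2.3", "1.0+"))  # True
print(satisfies("0.9", "1.0+"))    # False
```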

the Raku module ecosystem helps with discoverability (raku.land), security (author authentication) and ease of creation (fez upload)

there are 2,777 raku modules available via raku.land as at the time of writing and multiple new releases every week

since Raku has a unified and consistent approach to package management and version control, it is a better choice than Python for this job

why raku Types & Roles?

by now I can hear comments from the back “why not use the raku type system to better control our data standards?” quite right too, here’s how that looks in a rather more functional variant of our earlier example:

use JSON::Class:auth<zef:vrurg>;
use Contact::Address;

role Contact is json {
    has Str $.text is required;
    has Str $.country is required where * eq <USA UK>.any;

    has Str $.name is json;
    has Address $.address is json;
    has Bool $.is-company;
    has Str $.company;
    has Str @.email;
    has Str @.phone;

    ...

    method Str {
        my @blocks = (
            self.name,
            self.address,
        );

        @blocks.join(",\n")
    }
}

this is taken from the initial release of the new raku Contact module

you will note that these attrs are ripe for improvement eg. by adding Name, Email and Phone custom types – which can be provided by another raku module such as Email::Address or even by Number::Phone from CPAN
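as a rough Python counterpart to the typed attributes and the suggested custom types, here is a sketch using dataclasses. The Email class, EMAIL_PATTERN and SUPPORTED_COUNTRIES are assumptions made for illustration and are not taken from the Contact module; the pattern is deliberately loose and is not a full email validator.

```python
import re
from dataclasses import dataclass, field

# Deliberately loose pattern for illustration; real email validation is more involved
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
SUPPORTED_COUNTRIES = {"USA", "UK"}


@dataclass(frozen=True)
class Email:
    """Hypothetical custom type wrapping a validated email string."""
    value: str

    def __post_init__(self):
        if not EMAIL_PATTERN.match(self.value):
            raise ValueError(f"not a valid email address: {self.value!r}")


@dataclass
class Contact:
    text: str
    country: str
    name: str = ""
    email: list = field(default_factory=list)

    def __post_init__(self):
        # Plays the part of `where * eq <USA UK>.any` in the raku role
        if self.country not in SUPPORTED_COUNTRIES:
            raise ValueError(f"unsupported country: {self.country!r}")
        # Coerce plain strings into Email objects so bad values fail early
        self.email = [e if isinstance(e, Email) else Email(e) for e in self.email]


contact = Contact(text="John Doe, ...", country="USA", name="John Doe",
                  email=["john@example.com"])
print(contact.email[0].value)
```

the design choice is that validation lives in the type, so every consumer of the standard class gets the same checks for free.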

with the Contact::Address arranged country by country like this:

role Contact::Address {
    method parse(Str $) {...}
    method list-attrs {...}
    method Str {...}
}

role Contact::AddressFactory[Str $country='USA'] is export {
    method new { Contact::Address::{$country}.new }
}

class Contact::Address::USA does Contact::Address {
    has Str $.street;
    has Str $.city;
    has Str $.state;
    has Str $.zip;
    has Str $.country = 'USA';

    method parse($address is rw) {...}
}

class Contact::Address::UK does Contact::Address {
    has Str $.house;
    has Str $.street;
    has Str $.town;
    has Str $.county;
    has Str $.postcode;
    has Str $.country = 'UK';

    method parse($address is rw) {...}
}

...

this is a vestigial plugin structure, with many gaps, for country specific parsers to be bundled with the Contact module. here’s how the lib tree looks:

raku-Contact/lib > tree
.
├── Contact
│   ├── Address
│   │   ├── GrammarBase.rakumod
│   │   ├── UK
│   │   │   └── Parse.rakumod
│   │   └── USA
│   │       └── Parse.rakumod
│   └── Address.rakumod
└── Contact.rakumod
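the country-by-country arrangement above can be sketched in Python as a small registry plus a factory function. This is a loose analogue of the parameterized AddressFactory role, not a translation of it; the register decorator, the ADDRESS_CLASSES dict and the class names are all hypothetical.

```python
from dataclasses import dataclass

# Registry mapping a country code to its address class
ADDRESS_CLASSES = {}


def register(country):
    """Class decorator that files an address class under its country code."""
    def decorator(cls):
        ADDRESS_CLASSES[country] = cls
        return cls
    return decorator


@register("USA")
@dataclass
class AddressUSA:
    street: str = ""
    city: str = ""
    state: str = ""
    zip: str = ""
    country: str = "USA"


@register("UK")
@dataclass
class AddressUK:
    house: str = ""
    street: str = ""
    town: str = ""
    county: str = ""
    postcode: str = ""
    country: str = "UK"


def address_factory(country="USA"):
    """Return an empty address object for the requested country."""
    try:
        return ADDRESS_CLASSES[country]()
    except KeyError:
        raise ValueError(f"no address class registered for {country!r}") from None


print(address_factory("UK"))
```

a contributor adding a new country only has to define one class and register it; nothing else in the module needs to change.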

my expectation is that this structure will get refactored and morph over time – for example, we may agree to have localized attribute names (and maybe to alias them) so that my Contact::Address is their Kontakt::Addresse

a word on collaboration – unlike much of my work around the raku module eco-system, adoption and success of the Contact module is 100% dependent on a community of contributors adding country-specific classes and parsers for Address, Name (Title), Phone and so on … there is also room for rich integrations such as Google Maps (or other) address lookup, HTML / CSS / JS / React forms generation and so on – I am happy to review PRs, discuss evolving structure and so on

please do join in the fun – this means you

~librasteve

why raku Grammars?

one jumping off point for this work was to spend some time with the raku DateTime::Parse module which has some great examples of how to integrate “role oriented behaviours” within a raku Grammar

and Contact fulfils a similar role in that each class incorporates a parse method that will extract and load the class attributes from a text file provided

that way the Contact module delivers more than a simple definition of the information standard: it also brings an ingestion engine built on raku Grammars, like this one:

class Contact::Address::USA::Parse {
    has $.address is rw;
    my @battrs;          #bound to attrs

    grammar Grammar does GrammarBase {
        #<.ws> is [\h|\v] (allows single & multi line layouts)
        token TOP {
            <street> \v
            <city> ','? <.ws>
            <state> <.ws>
            <zip> \v?
            [ <country> \v? ]?
        }

        token city { <nost-words> }
        token state { \w ** 2 }
        token zip { \d ** 5 }
        token country { <whole-line> }
    }

    class Actions {
        method TOP($/) {
            make-attrs($/, @battrs)
        }

        method street($/) { make ~$/ }
        method city($/) { make ~$/ }
        method state($/) { make ~$/ }
        method zip($/) { make ~$/ }
        method country($/) { make ~$/ }
    }

    ...

    method parse {
        Grammar.parse($!address.&prep, :actions(Actions)) or
            X::Contact::Address::USA::CannotParse.new(:$!address).throw;
        $/.made
    }
}

it is interesting, when building the code, to weigh the benefits of code placement options – in this case, I was keen to keep related Grammar and Action classes together due to the tight interplay between them

there is some commonality between USA, UK and other address parsers which is managed by the role GrammarBase, but largely these parsers are quite different to account for radically different conventions in address layout, zipcode / postcode formats, state abbreviations and so on – so they each get a separate Parse.rakumod
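for comparison, here is a rough Python approximation of the USA parser that swaps the raku Grammar for a single verbose regular expression with named groups. It is a sketch and not the module's implementation: US_ADDRESS and parse_us_address are hypothetical names, and the character classes are guesses at what the street, city and country tokens accept.

```python
import re

# One named group per grammar token; re.VERBOSE allows layout and comments
US_ADDRESS = re.compile(
    r"""
    (?P<street>[^\n]+?),?\s*\n               # street: first line, trailing comma optional
    (?P<city>[A-Za-z .'-]+?),?\s+            # city: words, optional comma, then whitespace
    (?P<state>[A-Za-z]{2}),?\s+              # state: two letters
    (?P<zip>\d{5}),?                         # zip: five digits
    (?:[ \t]*\n(?P<country>[^\n]+?))?        # country: optional final line
    \s*\Z
    """,
    re.VERBOSE,
)


def parse_us_address(text):
    """Return a dict of address parts, or raise ValueError if the text does not fit."""
    match = US_ADDRESS.match(text.strip())
    if match is None:
        raise ValueError(f"cannot parse US address: {text!r}")
    return match.groupdict()


sample = "123, Main St.,\nSpringfield,\nIL 62704,\nUSA"
print(parse_us_address(sample))
```

a regex like this gets hard to extend once several countries are involved, which is where splitting shared tokens into something like GrammarBase pays off.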

DateTime is a great example of a “near core” set of information standards – the module efficiently implements a selection of the date time RFC standards available in computing – when you think of it there is a whole library of possible data standards out there that can employ and extend the general approach outlined here. pretty much anything that is often stored in a database table is fair game

why raku Test?

as you evolve a complex parser Grammar, you bring in test cases like these:

use v6.d;
use Test;

use Contact::Address;

my $addresses = q:to/END/;
123, Main St.,
Springfield,
IL 62704,
USA

123, Main St.,
Springfield,
IL 62704

123, Main St.,
Springfield,
IL
62704
...
END

my @addresses = $addresses.split(/\n\n+/);

for @addresses -> $address is rw {
    lives-ok {AddressFactory['USA'].new.parse: $address}, 'lives-ok';
}

done-testing;

with slight differences in line layout, punctuation and so on (a maze of twisty passages all alike)

it is vital to maintain a representative set of (regression) tests to be sure that as you add a new variant, you do not break the parser for an old one
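the same regression idea can be sketched in Python: split a block of sample text on blank lines and report every variant the parser rejects. The parse_stub function is a stand-in assumption (it only checks for a two-letter state followed by a five-digit zip) so that the harness is self-contained; in practice you would pass in the real parser.

```python
import re

SAMPLES = """\
123, Main St.,
Springfield,
IL 62704,
USA

123, Main St.,
Springfield,
IL 62704

123, Main St.,
Springfield,
IL
62704
"""


def parse_stub(address):
    """Stand-in parser (an assumption): accepts text with a state and a 5-digit zip."""
    if not re.search(r"\b[A-Z]{2},?\s+\d{5}\b", address):
        raise ValueError(f"cannot parse: {address!r}")
    return address


def run_regression(parse, samples):
    """Parse each blank-line-separated sample; return the ones that fail."""
    failures = []
    for address in re.split(r"\n\n+", samples.strip()):
        try:
            parse(address)
        except ValueError:
            failures.append(address)
    return failures


failures = run_regression(parse_stub, SAMPLES)
print(f"{len(failures)} failing variant(s)")
```

each new layout found in the wild gets appended to the samples, so an old variant that stops parsing shows up in the failures list straight away.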

like the raku ROAST suite itself, the module test becomes the core definition of the acceptable syntax for our information standards – in the case of text to be read from the “wild” (csv files, LLM outputs, web scrapes, etc) our ambition is to be very open to any formats out there … once again this underlines the benefits of collaboration to grow the range of sample and test data

— conclusion —

I hope that I have made the case for raku classes to be used as standards for parsing, storing and exchanging common information types such as Contact and Address

if you were one of the elves paying attention to the raku advent blog, you will have already seen some more details on the software implementation here and here

Please do feel free to comment or give feedback below, and to raise Issues and PRs over at the GitHub repository

~librasteve

2 Comments

  1. Brian Julin says:

    Was dropping back by Raku weekly as I do from time to time and noticed this post.

    A few standards you should be aware of if you are going to start dealing with contacts… not contact related exactly, but the address portions are constantly under discussion in the context of emergency call response systems.

    As the situation now stands here in the U.S.A. most VoIP equipment sold supports a very old standard called NENA ALI3, roughly equivalent to the approved IETF standard for DHCP, rfc4776. There are subsequent revisions of this standard. There is also rfc5139 which has some level of support on some equipment, and then NG911 and more proposed standards after that. The term “GEOPRIV” will bring you into a nest of standards, as will “PIDF”

    I’d be remiss if I didn’t warn that this way lies madness. There may or may not be standards for conversion between the above standards. There is somewhere in the advanced emergency response standards (which cover way way way more than addresses and try to standardize the entire emergency response systems right down to stuff like identifying the make of car driven by a criminal suspect) where was i… oh… there is a protocol standardized for providing a web service to map the various standards to one another but I couldn’t find any core logic specified, just the communications protocol. There are a bunch of people constantly making pitches at conferences trying to either make standards more explicit or to simplify them. Lots of XML junkies. It’s a complete mess.

    Hopefully if you tread down this path, Raku will serve you well!
