JSON in Snowflake

Snowflake is an amazing database. It offers auto-scaling, the ability to separate workloads, and the ability to recover data using its time-travel capabilities.

However, one of the things I like most about Snowflake is its ability to directly query JSON data stored in a variant column (see "Query JSON Data").

Combine this with other tools like Kafka and Debezium and you have some very rapid prototyping capabilities for analyzing your application database.

Debezium monitors the transaction logs of your application database and publishes change events to Kafka.

Using the Snowflake Kafka Connector, you can capture these events and store the data in what would otherwise be called staging tables. 

The Kafka Connector stores the data as JSON, but querying JSON in Snowflake is simple enough that any SQL developer can extract the pertinent data from the JSON structure.

An example from their documentation is:

SELECT src:device_type
  FROM raw_source;

The src column of the raw_source table is where the JSON is stored. Once you know the JSON structure you want to query, the :COLUMNNAME notation is all you need to pull a value out of the variant column where the JSON data is stored.
For more complicated JSON structures, you may need to use some flattening techniques to get all of the data out of the JSON structure.
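
As a minimal sketch of what flattening can look like, assume the JSON in src also contains an array called events, and that each element of that array has a numeric field f. Both names are assumptions made for illustration. LATERAL FLATTEN turns each array element into its own row:

```sql
-- Sketch only: assumes src holds an "events" array whose elements have a field "f".
SELECT
    src:device_type::string AS device_type,  -- top-level field, cast from variant to string
    evt.value:f::number     AS f_value       -- field inside each array element
FROM raw_source,
     LATERAL FLATTEN(input => src:events) evt;  -- one output row per element of src:events
```

Each row of raw_source is repeated once per element of its events array, and evt.value holds that element as a variant, so the same colon notation works on it.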

With tools like this, rapid prototyping is a breeze. The more complicated ETL jobs may not be necessary if the goal is to rapidly create a dimensional data model from your source application, then expose that data model with your favorite reporting tool (like Tableau).

Combining data from multiple sources using this technique is also very straightforward, so long as there are common keys in the systems that relate the data together.
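
A hedged sketch of such a join follows. The tables raw_orders and raw_customers, and the JSON fields customer_id, order_id, and customer_name, are hypothetical names chosen for illustration. Each table is assumed to have a variant column named src, like raw_source above:

```sql
-- Sketch only: all table and field names here are hypothetical.
SELECT
    o.src:order_id::number      AS order_id,
    c.src:customer_name::string AS customer_name
FROM raw_orders o
JOIN raw_customers c
  -- cast both sides to the same type so the common key compares cleanly
  ON o.src:customer_id::number = c.src:customer_id::number;
```

Casting the key on both sides matters because values pulled out of a variant column stay variants until you cast them.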

This allows architects to rapidly play The Enrichment Game: combining data from multiple sources, rationalizing the data, enriching one application with the data of another, then exposing that data through a reporting tool.

There are a number of problems I had to solve in different ways before I had this capability. Using this particular combination of tools to rapidly create reports and respond to business needs drives more conversations about using data to solve problems.
These conversations are much more valuable than conversations about how complicated a data ecosystem is, or about the need to write more data-munging code to get data migrated from one system to another.


No Response?

Does No Response mean: Yes, No, or Piss off? 

I am an old guy. I know this means I have certain expectations for the way things should be done. Please and Thank you never go out of style. Sorry is a sign of weakness, but if you make a mistake or miscommunicate, you should acknowledge that.

I learned one of my earliest lessons in protocol in Marine Corps boot camp. Our platoon was waiting on something (which is a good portion of what boot camp is all about: learning to be prepared for your next set of instructions). Our Drill Instructor told us to watch something.

There was a group of officers some distance away waiting for their next set of instructions, and a lone recruit was walking towards them.

The recruit stopped, saluted, and requested permission to walk by.

Every officer in the group of about 8 returned the salute. 

The Drill Instructor then got our attention and, in words I don't quite recall, said something to this effect:

The Salute is a sacred duty. Every Marine has to salute superior officers. But here is the duty behind the salute. Every officer must acknowledge and return the salute. 

The lowliest Private can salute the Commandant of the Marine Corps, and the Commandant will return the salute. It is an acknowledgement of our shared duty and heritage as Marines.

This may be a touching story, but what does this have to do with anything? 

If you are a Director, Vice President, or C-level executive and someone in your reporting structure communicates something to you, you should at the very least acknowledge the communication.

Something as simple as "I have received your message; I need to think about it," a thumbs up on Slack, or even a simple OK is better than nothing.

Communication goes both ways. 

We are flooded with communication today. Slack, Teams, email, Text, alerts, etc. 

Many of these are automated and do not stop until they are acknowledged. 

Should we not have enough professionalism to at least give a little: 


Poor Planning

Poor Planning on your part does not constitute an Emergency on my part.  

(Except when it does.)

Recently I was involved in a major hardware failure at work.

There had been indications our disk storage was under duress for quite some time. It had been in my status reports for months; I finally quit reporting on it since none of our leadership team even acknowledged the concern.

They would say, "Yeah, we know we need to do something there." Or even better: "Your predecessor complained about that as well."

Over a long weekend involving most of my team, teams from other groups, infrastructure support vendors, and hardware vendors, we fought for our customers, not for management.

War room calls were set up to run around the clock, with people stepping in and out of the meeting. We were finally able to get things ironed out.

But at what cost? 

Taking care of this issue proactively would have meant spending a bit of money up front, either to get new infrastructure or to upgrade what we had.

However, by being down for as long as we were, we broke faith with our customers.

I don't know the long-term impacts of that on our organization.

In cases like this it is never a good idea to say "I told you so."

A current joke going around says: "At the start of every disaster movie there's a scientist being ignored."

When you are in the middle of a disaster, even if you were the one who could have prevented it, the only thing you can do is work hard on the recovery.

There is a tendency to shoot the messenger when dealing with issues, but that should be the last thing on anyone's mind. The messenger is the one who has known about the problem the longest. Usually they are the person who has been kept up at night stressing over what to do.

The messenger probably has the most ideas about how to solve the problem.

Knee-jerk reactions do not create long-term solutions. Careful design, planning, testing, and proactively building in resiliency and stability create long-term solutions.

None of these are cheap, but as the saying goes: "Pay me now, or pay me later."

Proactive "pay me now" situations appear to be expensive.

Reactive "pay me later" situations make the proactive look like nickels and dimes.

When you are getting warnings about things that need to be addressed, don't ignore them. If you are the one giving the warnings, don't give up.

Keep warning.

Keep telling.

Above all, keep planning.

When the disaster strikes, someone has to be the voice of reason.


Question The Answers.

Some time ago I wrote an entry on the difference between Data Science and Business Intelligence:

I recently came across this quote:

“Advances are made by answering questions. Discoveries are made by questioning answers.” - Bernard Haisch

I think there is a relationship between this quote and that previous post.

In essence, what I was attempting to say was that Business Intelligence is generally a process your data flows through, one that enriches application data and prepares it so that it can be used to answer questions. These questions may be simple:

  1. How many widgets did this business unit produce last quarter? 
  2. How many did that business unit sell last quarter? 
  3. Which sales person sold what percentage last quarter? 
  4. What is the recurring cost of this Customer? 

These are all important questions. However, this same data should be used as part of any predictive effort. If you are using the same data for your Data Science efforts and your Business Intelligence efforts, then as you chart new territory through Data Science, your Business Intelligence platform will assist in showing the value of the Data Science effort.

These two sides of a similar coin can and should be complementary. 

Business Intelligence will drive your business forward; Data Science will show you the direction you should go.


How to create a mathematical model

One day a friend and I were telling stories about some of the previous jobs we had held. He had been a teller at a bank. He asked me if I knew how they train tellers to recognize counterfeit money. "It's really easy," he said. "They never give them counterfeit money to work with. They always work with real bills when they practice counting and such. Then when you get a counterfeit, it feels funny."

I recently recalled this conversation when I was looking for a pattern in some numbers.
As I copied the numbers into Excel, I realized the progression looked something like a logistic progression with a base of 20. This tiny insight allowed me to do some further searches and find that there is a mathematical model that already represents the data I was looking at: the Watts-Strogatz model.

By no means does this make me an expert in this area, but it did drive home for me the value of studying formal mathematical models.

I think the more you study formal models, whether they are in your current domain or not, the more familiar you become with various types of models and the better you will be at creating models yourself.

After all, a mathematical model is a set of rules that describe the behavior of data. Understanding how data behaves in various scenarios will improve your ability to recognize a pattern.