How many times do you have to stand in the rain before lightning strikes?

Recently, I have seen this picture a number of times:

This is a cute little anecdote about taking risks and being open to opportunities.

Here is a counter thought:

How many people were invited to other rooms and are not billionaires now?

How many people have spent nights and weekends coding or building someone else's idea only to never see a dime?

Now don't get me wrong. I absolutely love working with entrepreneurs!

The excitement of a new idea, the thrill of building things from scratch, the camaraderie of working on something that is new and rushing to get something to market before someone else builds something similar.

These are fun things to work on.

However, everyone should be committed to the goal with the same amount of buy-in.

As I wrote previously, beware the partnership where the technology person or team is the only one working on the project. This is called contracting.

If you and a friend have an idea, and you are both working various angles on the idea, go for it.

If you are not a technology person, and you need a "partner" to do the actual building part, you have just hired a consultant. They may work with you for some sort of percentage of future ownership (sweat equity), but at some point sweat, motivational speeches, and possible future options do not put food on the table.

Never be afraid to take risks.

The risk for the technologist is spending time, effort and expertise on a project that may never pay off. My caution for you if you are a technologist is this: Don't expect to make your expected hourly rate. Be flexible and negotiate. Maybe even suggest that the idea person pay for equipment or some other tangible item if they are unwilling or unable to pay you directly.

The risk for the idea person is that you may be paying for something that does not quite fit in with your vision. My caution for you if you are an idea person is this: Either be willing to pay for expertise that you do not have, or simply do not talk about your idea with anyone. If you do not have the ability to pay for expertise, use Lean techniques to figure out the quickest path to make money with your idea. If you are currently working, leverage your savings or take a portion of your current income and save it till you can afford to pay for expertise, experience, equipment or some other tangible item to help you build your idea.

If you can't risk losing a bit of money (for the idea person) or time (for the technologist), then don't get involved in building out something.

If you are currently in either of these situations, and are uncomfortable talking about these things, share this link with your business partner. Have a conversation about the uncomfortable topic of money early on. If you are willing to commit your future to an idea with your partner, you should be willing to discuss money, and you should do it sooner rather than later.

You do have to take risks to be successful. Sometimes it simply rains.

On rare occasion lightning strikes and you are able to convert from working on a side project to doing something you love full time.

Either way, you will get wet.

The question is, how wet are you willing to get?

Will you take a bath, or be singing in the rain? 


Predictive Analytics World New York 2016 Supercharging with Ensemble Models.

Dean Abbott taught this class on Ensemble models.

One of the in class demonstrations was an example from The Wisdom of Crowds. He passed around a bottle with some cereal in it. Everyone guessed, and then he averaged the guesses.

Two people were closer to the true answer than the ensemble model (an average of all of our individual guesses, each based on our own internal model of the bottle and the size of the cereal), but the ensemble beat everyone else.
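The bottle exercise can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the true count and the guesses are made-up numbers (not the ones from the class), and crowd_vs_individuals is a name invented here.

```python
from statistics import mean

def crowd_vs_individuals(guesses, truth):
    """Compare the error of the averaged guess with each individual's error."""
    ensemble = mean(guesses)
    ensemble_error = abs(ensemble - truth)
    # Count how many individual guesses land closer to the truth than the average.
    beaten_by = sum(1 for g in guesses if abs(g - truth) < ensemble_error)
    return ensemble, ensemble_error, beaten_by

# Made-up numbers, for illustration only.
truth = 500
guesses = [300, 350, 420, 480, 510, 560, 640, 700, 810]

ensemble, error, beaten_by = crowd_vs_individuals(guesses, truth)
print(f"ensemble guess: {ensemble:.1f}, error: {error:.1f}")
print(f"individuals closer than the ensemble: {beaten_by} of {len(guesses)}")
```

With these invented guesses, a couple of individuals beat the average, but nobody knows in advance which individuals those will be. That is the appeal of the ensemble.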

There were also hands-on demonstrations with the Salford Systems Predictive Modeler. You can find out more about the tool at this link.

Dean is a thorough instructor, and clearly could educate all of us on the various ways of doing predictive modeling.

He talked about Logistic and Linear regression, decision trees, random forests, and how to combine these specific models with various options as an Ensemble model.

He touched just briefly on deep learning.

I look forward to hearing from him again. I think I would learn something new every time.

Dean recommended his book, Applied Predictive Analytics: Principles and Techniques for the Professional Data Analyst. I look forward to starting it on the flight back.

It has been an exciting time in New York. Whenever I attend these conferences and workshops, I always feel like the more I learn, the less I know.

I did take a few pictures, and made a few tweets about one day speaking at a future event. I think I have a lot to learn to be on par with these speakers.

Continuous learning is the key to expertise.


Predictive Analytics World New York 2016 Day 2

Day 2 kicked off with a bang.

John Elder spoke about the way we think about and perceive problem solving. Daniel Kahneman's concept of System 1 thinking versus System 2 thinking applies to predictive analytics because our System 1 thinking can often overwhelm our System 2 thinking.

Kahneman's excellent book Thinking, Fast and Slow is a great overview of these concepts.

Dean Abbott and Karl Rexer followed up the great kick-off with a question and answer session. A couple of the more interesting questions were:

"How do you merge Analysis frameworks with Agile Frameworks?" (Answer: It's hard)


"Should Data Science report to business units trying to solve a problem, or Information Technology departments where the same resources can be shared and leveraged across the organization?" (Answer: It depends on your organization and support for Data Science.)

Pasha Roberts gave an amazing overview of Talent Analytics' approach to understanding workforce movement and the flow of employees through an organization. Agent-based models, Markov chains, and directed graphs were the details of how to solve this problem. I was on cloud nine; it is so refreshing to hear about applications of these techniques to business problems. I quickly lose most people I speak to about these techniques. :)
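To give a flavor of the Markov chain piece, here is a minimal sketch of employees flowing between states over time. It is not Talent Analytics' model: the states, the transition probabilities, and the starting headcount are all hypothetical, and it ignores hiring entirely.

```python
# Hypothetical states and yearly transition probabilities (each row sums to 1).
STATES = ["junior", "senior", "manager", "left"]
TRANSITIONS = {
    "junior":  {"junior": 0.6, "senior": 0.2, "manager": 0.0, "left": 0.2},
    "senior":  {"junior": 0.0, "senior": 0.7, "manager": 0.1, "left": 0.2},
    "manager": {"junior": 0.0, "senior": 0.0, "manager": 0.9, "left": 0.1},
    "left":    {"junior": 0.0, "senior": 0.0, "manager": 0.0, "left": 1.0},
}

def step(headcount):
    """Advance the expected headcount in each state by one period."""
    nxt = {s: 0.0 for s in STATES}
    for src, count in headcount.items():
        for dst, p in TRANSITIONS[src].items():
            nxt[dst] += count * p
    return nxt

def project(headcount, periods):
    """Apply the transition table repeatedly for the given number of periods."""
    for _ in range(periods):
        headcount = step(headcount)
    return headcount

# Made-up starting headcount.
start = {"junior": 100.0, "senior": 50.0, "manager": 10.0, "left": 0.0}
after_three = project(start, 3)
for state in STATES:
    print(f"{state:8s} {after_three[state]:6.1f}")
```

The transition table is also a directed graph: each nonzero probability is an edge from one state to another, which is one way the three techniques from the talk fit together.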

A few more vendor presentations, and some Q&A sessions as well as talks on Design thinking and graph analysis of food recommendations rounded out the rest of the day.

Tomorrow is more workshops. I will be attending the workshop by Dean Abbott on Ensemble Models.

New York has been a great trip. It is always a great experience to spend time with professional peers who are wrestling with some of the same problems and challenges.


Predictive Analytics World New York 2016 - Day 1

Day 1 was exciting.

Great keynote by Eric Siegel on understanding whether a discovery you have made is BS or not.

(Bad Science for those of you not there.)

Vast search introduces new considerations when working with predictive models on large data sets because, in a large enough data set, almost any condition can be found.

After all, "If you torture your data long enough, it will confess to anything."

Really knowing whether any finding is valid is very important when dealing with big data.
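A small simulation makes the vast search problem concrete. This is my own sketch, not an example from the keynote: the target and every feature are pure random noise, so any correlation found is spurious by construction. The function names, row count, and seed are arbitrary choices.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def strongest_spurious_correlation(n_rows, n_features, seed=0):
    """Largest |correlation| between a random target and pure-noise features."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_rows)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_rows)]
        best = max(best, abs(pearson(feature, target)))
    return best

# The more noise features we search through, the better the "best" one looks.
for n_features in (1, 10, 100, 1000):
    r = strongest_spurious_correlation(n_rows=30, n_features=n_features)
    print(f"{n_features:5d} noise features -> strongest |r| = {r:.2f}")
```

Because the seed is fixed, each larger search includes the features from the smaller one, so the strongest correlation can only go up as the search widens, even though nothing real is there to find.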

I followed the track related to Uplift Modeling.

I will need to go through Eric's book on uplift modeling to best understand it, but the examples provided in the Day 1 sessions were enough not only to whet the appetite, but also to let me dive in and do some experiments.

I also met with the fine people at Elder Research. It turns out we both worked on a very similar government project some years ago.

They have done some research on, and applications of, integrating CRISP-DM with the Agile framework. This is most intriguing; I have to follow up with them to learn more about how they married such different methodologies.

At the end of Day 1 I signed up for dinner with strangers. There were three groups of about eight folks each, sent to various restaurants throughout New York City. We had some good conversations about our individual struggles to bring Predictive Analytics to the masses.

I made some new networking connections, and look forward to staying in touch with some really positive professionals.


Predictive Analytics World New York 2016 pre-workshop

I am in New York City attending the Predictive Analytics World conference.

Max Kuhn was the speaker at the first session I attended, "R for Predictive Modeling: A Hands-On Introduction".

Max is a great speaker, and very knowledgeable about the topic. He has loads of experience in doing predictive modeling.

Every time I attend a course like this one, I learn that there is so much more I have to learn.

We covered principal component analysis, feature selection, exploratory data analysis, and various regression capabilities.

For many of the topics he covered, he provided links to blog posts and GitHub presentations that he has done before.

These sessions are always fascinating for the variety of ways in which Predictive Analytics is used in many different environments.

I was able to speak to a few folks about the ways in which they are applying predictive analytics, and I look forward to more sessions as the conference proper kicks off tomorrow morning.

Most of the tracks on my schedule for tomorrow are about uplift modeling. I look forward to learning more about both how it works and how to apply it.


Is Analytics a Noun or a Verb?

Is Analytics the name of your department, or do you actually "do" Analytics?

Doing analytics requires you to look at your data, apply some logic, and make or support making a decision with the data.

For many years I have built and maintained analytical platforms. These platforms had the core of a Business Intelligence architecture with some one-offs for the occasional "sophisticated" analysis as needed. I was not specifically doing analytics during this time. I knew many of the tools and techniques that were being applied. At times, I was even the one writing the SQL queries to pull the data together to load into SAS for statistical modeling. However, I rarely took it so far as to actually do the Analytics myself. That was not my role.

Now I am in a position where I am the one doing the Analytics, and I see and recognize the impedance mismatch that occurs when I use the term analytics, versus when some people use the same term.

Data Analytics is a very overloaded term in today's environment. Yet as sophisticated as we may be in evolving from our ancestors, simple things still make a big difference.

Using incredibly simple definitions:
A Noun is a person, place or thing.

A Verb is an action, or state of being.

Analytics can be a noun. "I am in charge of the Analytics department!"

Analytics can also be a verb. "I applied Analytics to the data until it gave me the answer!"

Analysis, or analytical thinking, is a way of learning from and understanding the data that we have available to us in order to solve a specific problem or answer a specific question.

I think this word evolved into a noun because there have been times when people with analytical skills (verb) were gathered together in one place. In order to have a question answered, you had to go to the Analytics department (now it is a noun: a place).

As this place evolved, the people doing the analysis needed support: programmers, managers, project managers, special coders, etc.

Now you can say you work in Analytics and mean the department. This carries some clout with it, because it sounds as if you have the skills and capabilities of those doing the analysis.

Not necessarily. You may learn some valuable things, and through the natural sequence of apprenticeship you may be able to be the one "doing analytics" at some point.

To me, Analytics is a Verb, and it should only be a verb. Using it in any other context is a disservice to the word.


The Little Data Science Checklist

There are lots of sources online and in book stores that will teach a person how to do Machine Learning, Regression, Text analysis, or any other fascinating topic related to Data Science.

But is that all there is?

Just know how to apply an algorithm, and you improve the bottom line of your company?

What if you need to justify a project? How do you demonstrate that there is a process that can be followed for data science?

Recently some colleagues and I were discussing this precise topic.

Here is the answer that we came up with:


            What is the question?
            Who asked the question?
            When do they want it?
            How does it provide business value?

            Validation Criteria
            Identify the data
            Collect the data
            Transform, Merge, Munge
            Analyze and explore

            Peer Review
            Visualize and Communicate
            Generate new directions
            Produce Finding

            Experiment Required

Supporting Data:

            Raw Data


I will pick a section and write more about each of these sections, but this little checklist is a beginning step in applying a Data Science process for an Enterprise. 


The travelling introvert.

On occasion I get to travel for work.

There is a commercial by Courtyard where part of the tag line is: "Some people have to travel for work, some people get to travel for work."

I am most certainly in the latter category.

I think one of the things that makes traveling as an introvert interesting is we do tend to pay attention to our surroundings a little better than others. Especially when I travel alone.

In her book, Susan Cain describes the physiological difference between an extrovert and an introvert. The introvert can be hyperstimulated by their environment. This stimulation can be managed through training, but at times it can overwhelm those who don't know how to handle it.

Recently I had a chance to travel to Boston for work. I wrote this little snippet on my Facebook page about a little girl I encountered during my layover:

Walking through this long terminal looking for my next gate, bumping into people and trying not to walk on people's feet, it's hard to actually notice any individuals. But then I heard it.

I had to look around through the throngs of people to find her, but I was able to zero in on her song.

She was about 2, wearing a Wonder Woman shirt, dragging a suitcase that was as big as her while holding mom's hand.

"Just keep swimming, just keep swimming", she sang.
"That's right", the Mom told her "we have a ways to go."
I almost started singing it back to her.
Made my day.

All of my children are grown, and when Finding Nemo came out they were all teenagers. But I can imagine, when my kids were younger, getting them to sing this song just to keep them occupied during long treks to wherever we needed to go.

There was one other little side jaunt that I was able to make during my short trip for work.

There is no need for a car when I stay in Boston, since my office is quite close to my hotel. However, I needed to get from Logan airport to my hotel.

There is a water taxi service.

It has a very reasonable rate, and you can travel across Boston Harbor. I had heard about it from a coworker and thought I would try it if I got the chance.

This trip, I got the chance.

Follow the signs in Logan for the water taxi till you reach the pier.

Then use the microphone to notify the taxi service what location you are at.

The short ride across the harbor at sunset was magnificent.

The photos I took of the excursion on the water do not do it justice.

To my fellow introverts, I encourage you to try something slightly new the next time you travel. If you have to be uncomfortable for a period of time stepping out of your comfort zone, find something interesting at your travel destination that won't take any time away from your travel objective, and try something new.

Remember, life is what happens while you are making other plans.



The data guy deals with hardware issues.

Remember how to do this?

Since moving into our new house, I had a few other priorities to deal with before getting my personal computing equipment in order.

Before the days of "the cloud," if you wanted to do any coding or data analysis, you needed a place to put the data, and it had to have a decent amount of memory and processing power to analyze anything larger than a few megabytes.

I bought a tower some years ago. I used it during some of my consulting projects, as well as for some of my personal research (research I should have pursued further, but which Apache Parquet now negates).

It may not be comparable to an EC2 machine, but it gets the job done for me: 1 TB of disk, 4 GB of memory, and a quad-core CPU. It has been in storage in a box since we moved away from Maryland (3 years ago).

I dusted it off, set it up, brought out its monitor and fired it up.

"It's alive!"

Yay. I have a tower again! (or so I thought).

I connected it to my router and decided that, since it was running something like Fedora 12 and the current version of Fedora is 24, it was time to upgrade. To do that, I needed to download the live disk.

After getting DVD-Rs to write the ISO image onto, I realized something.

I could use the live CD to test my laptop.

<rabbit hole>

My personal laptop died shortly after the move, and it has been sitting idly in my office for some time. I had been hoping to replace it, but there are always issues that come up that have a higher priority.

I built a VM on my current machine, and ran the disk checks on the Fedora DVD to make sure it had burned correctly. Then I put it in my personal laptop.

Fedora Lives!

Ok, so that means the hard drive in my laptop is gone. So I found a replacement and ordered that.

</rabbit hole>

Now back to the tower. I wanted to get my data off the tower, so I copied everything off.

Bam, machine died.


I mounted a local disk, copied off everything I needed, and then did the Fedora install.

Fedora, up and running!

Bam, machine died.

Ok, let's check some stuff.

I tested the memory, and during the test.

Bam, machine died.

Ok, so this machine needs new memory.

Off to Best Buy to get new memory.

While I was at Best Buy, I saw that they also had DDR3 laptop memory. Since I have to replace the hard drive for my laptop, why not do a memory upgrade?

I got the DDR3 for the tower and the laptop, then replaced the memory in the tower.


It's Alive!

Now I can do a yum update, since the machine will be up for a while.

Yum is being deprecated in favor of a tool called DNF.

So dnf update it is.

Once current, I needed to set this up as a non-GUI machine. This was called runlevel 3 in prior versions of Fedora/Red Hat.

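Now there is a new tool called systemctl. To switch over to more of a server configuration, the command is:

systemctl isolate multi-user.target

Note that isolate only switches the running system. To make the non-GUI target the default across reboots, the companion command is systemctl set-default multi-user.target.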

Ok, the tower is now in good shape. When my laptop hard drive arrived, I replaced the old drive, added the new 8 GB stick, and then installed Fedora.

After the install it was up and running, but the laptop would lock up after a few minutes.

Nothing will launch, and it even loses connectivity.


I started to lose faith in Fedora and tried Ubuntu. When Ubuntu live-booted, it had memtest86+ on the initial screen. I ran the memory test, and it immediately rebooted.

Let's try that again.

Same result.

Let's assume the memory is bad, and put the old memory back in.

Memory test again?

Now I have Fedora running on my laptop and Tower.

I will be going through migrating some of my R code now onto these machines.

While I am familiar with troubleshooting hardware issues, I am glad this is not something that I do on a daily basis.

It seems like this can be a constant rabbit hole in that, when you uncover one problem, often it exposes a new one.