Also posted at msilb7.com
Over my career, I’ve accumulated a set of skills that have helped me work faster and get more stuff done in less time. This let put more focus on the fun stuff like white-boarding new ideas and developing products. I’ve mostly kept all of this knowledge to myself — until now.
I’m sharing all of my secrets: The tools and strategies that I used to become a better analyst and PM.
Connect with me on Twitter @MSilb7. Let me know if I can help!
Excel Data Structures
Have you ever tried to analyze a dataset, only to be caught up in a heap of COUNTIF, AVERAGEIF, and column counting?
Your data tables should be designed to be as simple as possible to analyze. Ask yourself “Would a single pivot table from this dataset answer my questions?” Your answer should be yes. If not, then you may have some work to do.
As a general rule, your dimensions and partitions (e.g., product name, datetime) should determine new rows. Your values and measures (e.g., price, number of visits, rating) should determine new columns.
In some cases, I’ve used Power Query to rearrange or unpivot data in a structure that supports the pivot table that I want to create.
Whether you’re building your own tables, or transforming existing ones:
Take the time to get your data structure right up front, so that you don’t spend exponentially more time analyzing on the backend.
Image by Author — Using Power Query to unpivot data from columns to rows
Macros enable you to automate repeatable processes in Excel. This could be a data transformation (like above) that you have to do every so often, or it could be a workflow step that you want to turn into a hotkey.
Macros a good starting point, and probably your best bet if your work already lives in Excel, but you should explore other options like databases and/or python (if possible) for larger-scale projects. In my experience, Excel breaks down when you start dealing with a couple thousand of rows of data, and dealing with the visual location of data (cell layouts) can get annoying real quick.
I personally try to stay out of Excel as much as possible. My running advice (joke?) is “if you have to use Excel macros, then you shouldn’t be using Excel.” But, from my experience in roles with large companies, I am very aware that Excel may be the best or only option that you have. If it makes your work faster and your life easier, then it’s a good solution.
SQL and Databases
If you run into issues with Excel becoming slow or unusable with large datasets, or if you want your data to be shareable and easily plug into other applications, then databases may be a good path to explore.
SQL is the language to write database queries. I’d recommend looking up basic tutorials online, then experimenting with Select queries with whatever data sources you have available. I started learning SQL from querying the Baseball Databank database, which still exists!
Setting up a database may require a technical or business intelligence team to set up. I’ve used databases in two main cases, both of which required help: 1) Data tables were owned by other teams. I followed connection and permissions instructions from the business intelligence team, and then I was able to write SQL queries to work with the data; and 2) Data tables were owned by myself or my team. A software engineer set up the database and the connections, and then we worked together to start building and modifying tables.
I’ve used SQLWorkbench as an interface to query data and build tables in Redshift databases. I’ve also used pgAdmin for similar use cases in PostgreSQL databases. In my opinion, pgAdmin has a much friendlier interface for managing tables that I owned.
In previous roles, we didn’t have a business intelligence team, so my only option was Microsoft Access. I feel the struggle if that’s your case, but like I said before: “If it makes your work faster and your life easier, then it’s a good solution.” I’ll also admit that flipping back and forth between Access’ visual query builder and the code view was instrumental in helping me learn SQL.
Databases are also a core component to downstream process automation with dashboards for repeatable analysis and data sharing, and scripting languages.
This one is my biggest secret. Web automation (including web crawling, web scraping, etc) has saved me more time, and created more opportunities for me in my career than anything else.
My use cases were collecting data from websites on a recurring basis, and (similar to Excel Macros), running the same web-based processes on a monthly basis. With some set-up work up front, I was able to turn hours of clicking and copy + paste into the click of a button. It’s magical.
Side Note 1: I had issues where Firefox and Chrome would take up too much computer memory and slow everything down. If you ever have issues with performance, use the Pale Moon browser. If you still have issues, then set a 3–5 second pause every few iterations to let memory come down.
Side Note 2: From brief internet research, it looks like new versions of iMacros may have put some features behind a paywall (use old version 8.9.7). So please let me know if you find any good web automation alternatives?
Alternative Data Sources
If you’ve lived through decisions based on unfounded assumptions or finger-in-the-air guesses, then this one is for you.
We all want to make “data-driven” decisions, and that’s easy to do when the supporting data is easily available. But when the data isn’t there, that space is filled by theoretical discussions and personal anecdotes. If you can figure out how to fill-in those gaps, then you have a superpower.
In a previous role, I needed to understand emerging consumer trends, and I became obsessed with website traffic (SimilarWeb) and mobile app data (AppAnnie) as my window in. Having the data to prove or disprove any anecdotes saved so much productive time, and gave me seer-like abilities.
Whatever your industry or product is, there are hopefully data sources out there that can be used as a proxy for understanding trends, evaluating your competitive positioning, and simply figuring out what your customers want. Maybe web and app data doesn’t fit your use case, but you would be surprised what you can find by searching around for a bit. This is an area to get creative. I’ve seen some unconventional sources, like using job boards to track trending job titles (page 18).
Image by alternativedata.org: Financial-leaning alternative data sources
Dashboards for Repeatable Analysis and Data Sharing
This one is another gigantic time saver. That’s probably pretty shocking if you’re familiar with dashboards, but it goes beyond data visualizations.
My secret is using dashboards as a tool to: 1) Automate repeated analyses without having to use scripting languages, and 2) Share detailed data and customizable analyses across the team as a “single-source of truth.”
Here’s how it works:
Dashboards are often used to keep stakeholders up-to-date on the latest metrics and trends. For these dashboards, simplicity, aesthetics, and usability are likely the most important considerations. But, analysts and PMs will likely still need to pull additional cuts of data, check the datasets for errors, and run deeper analyses on the side (I did).
This extra work can take up a lot of time. Rather than saving a store of queries or templates and re-doing similar analyses over and over, what if we created our own dashboards where that work is already done? Your data output can become tables or charts in your dashboard. Your inputs, like the “where” clauses (filters) in your queries can become filter controls on your dashboard. Any parameters that you would pass into the query can also become filters or inputs on your dashboard.
Start from identifying your frequently asked questions, or frequently repeated analyses, and try to figure out what parts are constants, and what parts are variables that can be flexed out into filters or inputs. You can also share some of these internal dashboards with team members, so that they could quickly answer their own questions, and you can remove yourself as a bottleneck.
Build your dashboard once, use it often, and the time savings will build up.
Scripting Languages (Python)
Once you have a handle on breaking down processes, scripting languages let you do real workflow automation (I used Python). You can connect Python to databases for larger-scale data manipulation than what can be done in Excel. With Python and databases, you also don’t have to worry about spreadsheet aesthetics or data placement like you have to in Excel, and that also makes the code waaaaaaaay faster to run. Although, there is a bit of a mental adjustment in moving from visual spreadsheets to code.
Another benefit of using Python or any other open-source languages is that you don’t have to build everything from scratch. You can import “packages,” which are community-developed groups of functions that can be easily plugged-in to your code. These packages can help you connect your code with other applications and databases, produce projection models, run web crawls (see web automation), and more.
I used Anaconda as my environment/package manager with Jupyter Notebook to write and run code. You can install the pandas package through Anaconda (if not already) to set up your data in DataFrames (similar to tables) for easier manipulation. For a fantasy football projections side-project, I also used the sklearn (scikit learn) package which helped turn multi-variate regression into a few easy lines of code (and other machine learning applications which I haven’t explored yet).
There is a definite learning curve with all of this, and it’s super easy to get frustrated. I aggressively Googled and Stack Overflowed to figure out exactly how to do what I wanted to do (no shame). But, it’s so rewarding when you figure something out.
The Next Wave: No-Code Applications and UIs
Here’s my projection: No-Code application builders are going to take over large corporations and make most “internal tech team” work obsolete.
A few years ago, I used FileMaker to build apps for collecting data in the field (theme parks), and it *mostly* got the job done. But, FileMaker required some code, which was enough to make it inaccessible for most of the non-technical workforce. But recently, the “no-code movement” has exploded, and builders like Bubble, Webflow, and 100+ more (probably) are giving people the power to build applications completely visually, without ever touching code.
Image by bubble.io — Bubble Workflow Tab
So far, I’ve seen most no-code tools targeted at entrepreneurs and small business builders, but I think there’s a huge opportunity in the enterprise. What happens when developing internal applications and interfaces becomes as easy as setting up an Excel Workbook? Retool looks like it might fit this, but I haven’t tried it yet.
This future benefits those who can break down processes, see the problems, and build the solutions. Rather than defaulting to Macros, or requesting technical help or software engineering time, we’ll be able to easily build our own solutions that actually work.
We would enter an age of hyper-productivity, that could be really cool.
I shared the tools and strategies that I used, but it doesn’t matter how many tools you know how to use. What matters is being able to recognize when things are too slow, error-prone, or just don’t make sense; and figuring out how to make them better. Only then does it really matter.
Connect with me on Twitter @MSilb7. Let me know if I can help!