Data.World Project – Examining Public Employee Compensation in San Francisco

It has been a while since I began playing around with data.world, an awesome collaborative website where users can share data sets and draw insights from them.

I chose to take public compensation records from the City of San Francisco, mostly because public employee data is one of the easier bits of information to get a hold of.

To be honest, I really have no hypothesis or operating assumptions – I’m just seeing what happens, so to speak.

Methods

The first thing I did was download 2013-2017 San Francisco employee compensation records from data.gov, a US government-run website launched in 2009 as an effort towards transparency. Fortunately the information was pretty clean, so no scrubbing was necessary. I then imported the CSV into the data.world site as a new project. Data.world has a very cool query builder that allows you to leverage SQL.

I noticed that the most standardized field for categorizing jobs was ‘Organizational_Group’, which rolls the entire data set up into 6 buckets:

Public Protection
Public Works, Transportation & Commerce
Culture & Recreation
Human Welfare & Neighborhood Development
General Administration & Finance
Community Health

I thought it would be useful to see the average salary of each organizational group. Here it is for 2017:

Year | Organization Group | Average Salary
2017 | Community Health | $102,846.31
2017 | Public Protection | $141,707.62
2017 | Public Works, Transportation & Commerce | $104,978.49
2017 | Culture & Recreation | $50,408.47
2017 | Human Welfare & Neighborhood Development | $71,870.05
2017 | General Administration & Finance | $97,371.90
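If you would rather script this than use the query builder, here is a rough Python sketch of the same group-and-average idea. The column names (Year, Organizational_Group, Total_Compensation) and the sample rows are assumptions made up for illustration; they are not the real export or my actual query.

```python
from collections import defaultdict

# Hypothetical sample rows; the real export has many more columns and records.
SAMPLE_ROWS = [
    {"Year": 2017, "Organizational_Group": "Community Health", "Total_Compensation": 100000.0},
    {"Year": 2017, "Organizational_Group": "Community Health", "Total_Compensation": 120000.0},
    {"Year": 2017, "Organizational_Group": "Public Protection", "Total_Compensation": 140000.0},
    {"Year": 2016, "Organizational_Group": "Public Protection", "Total_Compensation": 150000.0},
]


def average_by_group(rows, year):
    """Return {group: mean total compensation} for a single year."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        if row["Year"] != year:
            continue  # skip records from other years
        group = row["Organizational_Group"]
        totals[group] += row["Total_Compensation"]
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}


if __name__ == "__main__":
    for group, average in sorted(average_by_group(SAMPLE_ROWS, 2017).items()):
        print(f"2017  {group}: ${average:,.2f}")
```

Swapping SAMPLE_ROWS for rows read with csv.DictReader would work the same way, as long as the numeric fields are converted from text first.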

I thought this was kind of useful, but it would be cool if I could see the average salary for each organizational group, year over year. I published the following ‘insight’, which can be shared with followers on data.world:

[Chart: average salary by organizational group, year over year]

It looks like Public Protection enjoyed, on average, the highest salary (or ‘total compensation’, which includes benefits), but it dropped slightly year over year, while Community Health, Public Works, and General Administration & Finance have been growing slowly over time.

My next question was about the total amount spent on employee compensation by organizational group, year after year.

Here is what I found:

[Chart: total compensation spent by organizational group, year over year]

Okay, okay… that was done in Excel…

THIS, however, was another insight published on data.world:

[Chart: each organizational group’s share of total compensation, year over year]

This insight, I hope, shows that each organizational group has taken about the same percentage of the pie year over year, with the most drastic decrease for Public Protection and an increase in Community Health.
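For the curious, the ‘percentage of the pie’ calculation can be sketched in a few lines of Python. As before, the column names and the numbers are hypothetical stand-ins rather than the real San Francisco data.

```python
from collections import defaultdict

# Hypothetical, tiny data set used only to show the calculation.
SAMPLE_ROWS = [
    {"Year": 2016, "Organizational_Group": "Public Protection", "Total_Compensation": 600.0},
    {"Year": 2016, "Organizational_Group": "Community Health", "Total_Compensation": 400.0},
    {"Year": 2017, "Organizational_Group": "Public Protection", "Total_Compensation": 500.0},
    {"Year": 2017, "Organizational_Group": "Community Health", "Total_Compensation": 500.0},
]


def share_by_group(rows):
    """Return {year: {group: fraction of that year's total compensation}}."""
    spend = defaultdict(lambda: defaultdict(float))
    for row in rows:
        spend[row["Year"]][row["Organizational_Group"]] += row["Total_Compensation"]

    shares = {}
    for year, groups in spend.items():
        year_total = sum(groups.values())
        shares[year] = {group: amount / year_total for group, amount in groups.items()}
    return shares


if __name__ == "__main__":
    for year, groups in sorted(share_by_group(SAMPLE_ROWS).items()):
        for group, share in sorted(groups.items()):
            print(f"{year}  {group}: {share:.1%}")
```

Dividing by each year’s own total is what makes the years comparable, even when overall spending grows.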

What have I proven? Probably nothing, but it was fun to play ‘data scientist’ for a bit!

Make your own profile and follow me on data.world!

https://data.world/gongstad


How to Study for The Salesforce Platform Developer Exam


If you’re like me, the first thing you do when you decide to go for something is to peruse the internet for advice from people that have done it. That was what I did when it came to both my Salesforce Administrator and Salesforce Platform Developer I certifications.

When I was up late after work studying for the Platform Developer I exam, I would catch myself half thinking/half praying ‘God, if I pass this test I promise to help others!’ Part of the reason I’m writing this now is because although I found it difficult to find some of the answers I was looking for, I did find many valuable articles from people who had ‘blazed the trail’ before me.

This is my attempt to give back.

I wouldn’t consider myself very technically experienced. I have my Bachelor’s degree in English Literature and up until 2014, I’d never opened an Excel file.

I do, however, possess a mindset that wants to solve problems and enjoys thinking through them – if you can keep that mindset, you’re on the way to passing the Developer Exam.

1.      Use the Salesforce Study Guide to plan

This is important. My first mistake was quickly veering off track from the study guide and down the rabbit hole of development documentation and YouTube videos of Dreamforce speakers. At one point I was attempting to actually read the entire Apex Development Workbook (now deprecated), making stacks of flash cards about the REST API, asynchronous callouts, and memory allocation limits.

If you don’t know what most of that is, don’t worry: you won’t need to for this exam. Use the Salesforce Study Guide, and only what’s on it, as the foundation of your studying. Download the guide here.

 Begin by reading the study guide over and over until you know what is in and out of the scope of the exam. Next, write down each section of the exam in order from least familiar to most familiar. Draw a line dividing the top half of the list from the bottom and apply an 80/20 rule – plan to spend 80% of your time studying in the top half (the less familiar) and 20% of your time on the remainder.

2.      Take the Developer Beginner Trail from Trailhead

Besides experience, Trailhead is the best resource out there for learning Salesforce. If you are unfamiliar with Trailhead, it’s a series of guided learning paths for everyone from business users to developers, where you can earn badges, track progress and plan your learning curriculum.

If you have not yet signed up, do yourself a favor and do it now! https://trailhead.salesforce.com/en/trails. You will need a developer org if you want to follow along with most of the training. You can get one here.

3.      Track by Comprehension, Not by Time

Remember, quality over quantity! I would avoid planning in time chunks. Just because more time is spent studying does not mean more is learned. Focus on what specific thing you want to know when you’re finished with a study session.

For example, looking at section five of the Study Guide, you might take ‘Describe how to use basic SOSL, SOQL, and DML statements when working with objects in Apex’ under Logic and Process Automation and decide to learn how to write three types of SOQL statements in one session and the limitations of SOSL statements in another. The best part about doing this is that you will hopefully leave every session feeling accomplished.

4.      Utilize Different Methods of Learning

Spend time in Salesforce Documentation reading, Trailhead and YouTube learning, and a development org doing! I know I have learned something when I can explain it and when I can think of ways to use it. Take what you want to learn – Read about it, watch someone do it, and finally do it yourself.

5.      Know Why and When

Most of the exam consists of scenario-based questions. You’ll never pass the exam if you only know terms and their definitions. After reading and learning about a piece of functionality, I ask myself why I would need it and when I would use it. Why should one use Process Builder over a workflow? Come up with a scenario or use case for both!

6.      Utilize Salesforce Training

If your company has premier support, you have access to a hub of instructor-led training videos! Navigate to ‘Help & Training’ in your org and look for the training section. Don’t sweat it if you do not have premier support; there is still plenty of other information out there.

7.      Salesforce Ben

Salesforceben.com is an awesome resource that I have used for both of my certifications. There is a great article about the Platform Developer I certification here: http://www.salesforceben.com/platform-developer-certification-guide-tips/. They give an awesome breakdown of the exam.

8.      Answer and Ask Questions in the Support Forums

Even if you’re not an expert in Salesforce Development yet, try to help others with their problems. It is a great way to find use cases and scenarios that can provide you with experience and hands-on learning! You probably won’t receive the vote for the ‘Best Answer’ all the time, but it begins the process of actively thinking about how to implement development solutions.

9.      Don’t Burn Yourself Out

No need to pull an all-nighter! Give yourself plenty of time; you can only take in so much at once. If you have accomplished your ‘learning goal’ for the session, take the rest of the day off! Go outside and enjoy some nature.

I hope this article has provided some good advice you can utilize to prepare for the Platform Developer I and other Salesforce exams. Good luck, you’ll do great!

Excel TIP (Add Data Validation with a Drop-Down List)

 

It is important to be mindful of how you enter repetitive data in Excel. There is nothing worse than trying to validate and modify a column that has the same value written multiple ways! Take a contact list where the same state, Texas, is recorded several different ways.

If you needed to filter this list by state, you’d have to account for all representations of Texas!

Adding drop-down validation is a great way to ensure values are recorded consistently, and it is super easy to do in Excel!

Let’s say Maggie’s Pet Supply sales team is calling a list of pet stores. One of the items they need to capture is the product each store is interested in ordering.

Rather than rely on the sales team to type the correct product name, we can store the list of products as a validation list that can be selected from a drop-down.

  • To do this, add a second sheet to store the list of possible products
  • Next, highlight the column where you want the drop-down to appear
  • Under the Data tab, select ‘Data Validation’
  • For Validation criteria, allow ‘List’
  • Click in the Source box and then click on the Product sheet
  • Highlight the values in the list and select OK
  • Navigate back to the Call Sheet and you can now select items from your drop-down!

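The idea behind the drop-down is simply ‘only accept values from a known list’. Here is a minimal Python sketch of that idea, assuming a made-up product list; the product names and the function name are hypothetical, and this is not how Excel implements validation internally.

```python
# Hypothetical product list, standing in for the values on the Product sheet.
ALLOWED_PRODUCTS = ("Dog Food", "Cat Litter", "Bird Seed")


def is_valid_product(value, allowed=ALLOWED_PRODUCTS):
    """Return True only when the entry exactly matches an item in the list."""
    return value in allowed


if __name__ == "__main__":
    for entry in ("Dog Food", "dog food", "Dogfood"):
        status = "accepted" if is_valid_product(entry) else "rejected"
        print(f"{entry!r}: {status}")
```

Because only exact matches are accepted, variations like ‘dog food’ never make it into the column in the first place, which is the whole point of the drop-down.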

 

 

Excel Trick: Using Formulas and Formatting to View Repetitive Data


With the approval of the good friend and colleague who first showed me, I wanted to share a helpful way to view repetitive data. Let’s assume that you are working with a list of companies and their associated contacts. After ordering the list by Company and scrolling down the list, it is easy to lose track of what you are looking at due to the repetitive nature of the data.


Try out this mixture of formulas and conditional formatting!

Before you begin, you’ll need a value that identifies each company and is present in every row. In this case I am using Company Name, but you could also use Website or a Company Number if available. You can use the data at the bottom of this post to follow along.

  • In the first row of the first empty column after your data, enter a ‘0’. In this example, that is cell J1
  • In the cell below (J2), enter the formula =IF(A2=A1,J1,J1+1) and press ENTER
  • In this example, the 0 is in column J, and column A stores our Company Name
  • Here, we are saying: IF the Company Name in this row equals the Company Name above it, return the value in J1 (in this case 0); if not, return J1 + 1 (in this case 0+1)
  • Copy the formula all the way down the column
    • The cell references adjust for each row, so every row is compared with the row above it
    • The result is that the same company has the same value in Column J and when the Company Name changes, the value increments by one! Cool, huh?

  • Next, we will nest our IF formula inside another function, like a formula Inception without Tom Hardy or Leonardo DiCaprio. We will use the MOD function.
    • To do this, click into J2, where the formula resides. Add ‘MOD’ after the = sign, followed by an opening parenthesis.
  • At the end of the formula, add a comma followed by the number ‘2’ and a closing parenthesis, so that it reads =MOD(IF(A2=A1,J1,J1+1),2)
  • Here, we are telling Excel to divide the number by 2 and return the remainder. A 2 will have a remainder of 0 when divided by 2 and a 3 will have a remainder of 1.
  • Drag the formula down the column
  • You’ll notice that each cell in Column J now has either a 0 or a 1

Great! We have the foundation to apply some conditional formatting!

  • Select everything in the sheet and navigate to Conditional Formatting, under the Home tab.
    • Select ‘New Rule’
  • Select ‘Use a formula to determine which cells to format’
  • Under the Rule Description, enter the following formula
    • =$J1=1
  • Select ‘Format’, then ‘Fill’ and select any color
  • Click ‘OK’ and then ‘OK’ again
  • See your data organized by color!

It may seem like a bunch of steps now, but after some practice you’ll be able to recreate it quickly. I still find many cases where this comes in handy!
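If it helps to see the logic outside of Excel, here is a small Python sketch of what the helper column ends up holding: a 0/1 flag that flips every time the company name changes. The function name is my own invention, and the sketch assumes the list is already sorted by company, just like the sheet.

```python
def band_flags(company_names):
    """Return a 0/1 flag per row that flips whenever the company name changes.

    Mirrors the helper column: start from 0 (the value in J1), add 1 when the
    name differs from the row above, and keep only the remainder after
    dividing by 2.
    """
    flags = []
    flag = 0  # plays the role of the 0 typed into J1
    previous_name = None  # stands in for the header row above the first record
    for name in company_names:
        if name != previous_name:
            flag = (flag + 1) % 2
        flags.append(flag)
        previous_name = name
    return flags


if __name__ == "__main__":
    names = [
        "Intelligence Network Committee",
        "Intelligence Network Committee",
        "Sales Zone",
        "Sales Zone",
        "The Mobile Phone Store",
    ]
    for name, flag in zip(names, band_flags(names)):
        print(flag, name)
```

Rows flagged 1 are the ones the conditional formatting rule would fill with color.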

Example Data:

Company Name | Website | Address | City | State | Country | Postal Code | Contact Name | Email | 0
Intelligence Network Committee | http://www.theintelcom.com | 654 Dangerzone | Iceville | TX | United States | 829918 | John Erich | john@theintelcom.com | 1
Intelligence Network Committee | http://www.theintelcom.com | 654 Dangerzone | Iceville | TX | United States | 829918 | Grant Christian | grant@theintelcom.com | 1
Intelligence Network Committee | http://www.theintelcom.com | 654 Dangerzone | Iceville | TX | United States | 829918 | Jeff Gulder | jeff@theintelcom.com | 1
Intelligence Network Committee | http://www.theintelcom.com | 654 Dangerzone | Iceville | TX | United States | 829918 | Brain Burke | brain@theintelcom.com | 1
Sales Zone | http://www.szone.net | 718 Winner | Los Angeles | CA | United States | 90210 | Karen Lyons | klyons@thesalezone.com | 0
Sales Zone | http://www.szone.net | 718 Winner | Los Angeles | CA | United States | 90210 | Jeff Lyons | jlyons@thesalezone.com | 0
Sales Zone | http://www.szone.net | 718 Winner | Los Angeles | CA | United States | 90210 | Sandy Hookshank | shookshank@thesalezone.com | 0
Sales Zone | http://www.szone.net | 718 Winner | Los Angeles | CA | United States | 90210 | Johnny Boy | jboy@thesalezone.com | 0
Sales Zone | http://www.szone.net | 718 Winner | Los Angeles | CA | United States | 90210 | Pupper Doggo | pdoggo@thesalezone.com | 0
The Mobile Phone Store | http://www.mphonestore.com | 123 Fake Street | Fakevill | KY | United States | 92011 | Goldi Sampson | goldi@mobilephonestore.com | 1
The Mobile Phone Store | http://www.mphonestore.com | 123 Fake Street | Fakevill | KY | United States | 92011 | Aaron Sampson | aaron.sampson@mobilephonestore.com | 1
The Office Gentleman | http://www.theofficegentleman.org | 555 Example Street | Mainville | CA | United States | 99221 | Grant Ongstad | grant.ongstad@theofficegentleman.org | 0
The Office Gentleman | http://www.theofficegentleman.org | 555 Example Street | Mainville | CA | United States | 99221 | Sarah Connor | sarah.connor@theofficegentleman.org | 0
The Office Gentleman | http://www.theofficegentleman.org | 555 Example Street | Mainville | CA | United States | 99221 | Maggie May | maggie.may@theofficegentleman.org | 0
Tim’s Tool Shack | http://www.timstoolshack.net | 829 Rochester Way | New York | NY | United States | 291 | Tim Anderson | Tim@timstoolshack.com | 1
Tom’s Baseball | http://www.tomsbaseballstore.com | 705 Mainstreet | Hoopville | IN | United States | 77266 | Tom Johnson | tjohnson@tbaseballstore.com | 0

To Forget or Not to Forget: An Examination of the GDPR ‘Right to be Forgotten’


 

The architects of GDPR stress the intention of the regulation: to increase both individual privacy and innovation. If innovation includes finding ways to be exempt from GDPR, they would be right.

In a growing consumer marketplace that relies heavily on massive amounts of data, it only makes sense that the most realistic approach to compliance will be to find ways to fit through its ‘loopholes’. At the thousand-mile level the regulation is innocuous enough: individuals must be made aware of how their data is being used and give their consent, and if they choose to, they can request that their personal data be completely removed from further ‘processing’.

A closer look at the regulation reveals a few scary sections, especially for data-driven industries, which are followed by rather vague exemptions.

To Forget

Businesses and other organizations are increasingly finding ‘secondary’ uses of data. That is to say, data that is collected for one purpose later ends up fulfilling another. An example would be an online retail company that collects address information for shipping purposes and later runs models on all address data to determine where frequent buyers reside. Under the GDPR’s ‘Right to be Forgotten’, there are some obstacles to this:

  “The data subject shall have the right to obtain from the controller the erasure of personal data concerning him or her without undue delay and the controller shall have the obligation to erase personal data without undue delay where one of the following grounds applies:

  (a) the personal data are no longer necessary in relation to the purposes for which they were collected or otherwise processed” (GDPR, Article 17.1)

The concept of consent may be easy enough to understand. What is unclear, however, are the cases where the controller becomes obligated to erase data. In the case of the online retail company mentioned above, would the company be obligated to erase the address data? Here we have a condition where the data is no longer needed for its original purpose but remains very valuable.

Of course, the company may foresee the need for the data and include something in the written consent along the lines of ‘this information will be used for shipping purposes and general marketing reasons’, but this could be a violation of the law’s definition of consent: “‘consent’ of the data subject means any freely given, specific, informed and unambiguous indication of the data subject’s wishes” (GDPR, Article 4.11).

A more concrete example may be the tremendous value of Google search queries. Google Trends is a way to see, visually, how people are using the search engine: what they are searching, what news they are looking for, and what is interesting to them. In 2006, AOL released over 20 million search queries. Each user remained ‘anonymous’ because their name was replaced with a unique ID.

An article in The New York Times reported that the identities of some users could be discerned from their search history, leading AOL to remove the information (Barbaro, Zeller 2006). This is a case of seemingly general data revealing a personal identity. Even though Google Trends represents a mind-numbingly large amount of aggregated data, would it not be possible to discern an identity from it? Furthermore, can the aggregation of each trend ever be considered the “purpose(s) for which they were collected or otherwise processed” (GDPR, Article 17.1)?

 

Or Not To Forget?

The architects of the GDPR accounted for reasons to continue processing data past its original purpose and allowed for a variety of exceptions. One such exception covers compliance with legal obligations, essentially leaving all government agencies exempt (as if this was any surprise). Other conditions relate to the public value of the data, as stated in Article 89, which covers “Safeguards and derogations relating to processing for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes”.

This would also seem to exempt government-sponsored research, such as anthropological and other population-based research, as well as medical and scientific research. For a great read about the research exceptions, check out this article from the International Association of Privacy Professionals: https://iapp.org/news/a/how-gdpr-changes-the-rules-for-research. Thus the main innovation that would come from the law may be for businesses to find a way to fall into an exception category by expressing a reasonable need for data retention after its initial use.

Consider, for example, another use of search query aggregation, where Google claimed that they could use the information to track the spread of the flu virus by analyzing users’ symptom searches. An article from the Guardian notes that “They also found that the Google statistics, which can be gathered daily, were up to two weeks ahead of the federal government’s data, which took time to assemble because it came from so many doctors” (Pilkington, “Google predicts spread of flu using huge search data”). Under the GDPR, this specific use case may qualify as being exempt. However, it is highly unlikely that Google could have foreseen the exact use of its query data. Had GDPR come a few years earlier, this incredibly valuable analysis may never have come to light.

Although regulations and their interpretation have a way of veering in different directions from each other, it will be interesting to see how GDPR will be enforced, what exceptions or exemptions will be made, and how companies, especially ones that rely heavily on large amounts of data will adapt.

 

 

 

 

Sources:

Barbaro, Michael and Tom Zeller Jr. “A Face Is Exposed for AOL Searcher No. 4417749” The New York Times. 9 August 2006. https://www.nytimes.com/2006/08/09/technology/09aol.html

Maldoff, Gabe. “How GDPR changes the rules for research” iapp.org. 2018. https://iapp.org/news/a/how-gdpr-changes-the-rules-for-research/

General Data Protection Regulation. https://www.eugdpr.org/

Photo by Dhruv Deshmukh on Unsplash

The BSA’s of The Business Systems Analyst


My job title can sound pretty vague. ‘Business Systems Analyst’ sounds like a few important words strung together. To those outside of the software development world, and perhaps even to those inside, it’s anyone’s guess as to what that means. Whenever I’m asked ‘what I do’, I typically respond one of three ways:

“I am in Software Development”

“I am in IT”

“I wear many hats…”

All are true, but none of them, even when combined, paints a complete picture. I did a bit of research to find out how others would describe the role of the Business Systems Analyst. I found some pretty insightful things. Here is a video from the Technology Profession YouTube channel that does a great job highlighting the job responsibilities as well as the main skills needed. For me, the easiest way to define the role is to break it up into three fundamental components that are already included in the title: Business Systems Analyst.

B is for Business

The first, and most important of the three. This is who the Business Systems Analyst serves. The BSA must know the business, its mission, and its goals, including an understanding of how the business generates revenue, how the business positions itself within its market and the overall growth strategy.

I often find myself going down a rabbit hole of tasks and emails before stopping to ask myself: how does what I am doing affect the goals or mission of my organization?

S is for Systems

This is the technological arm of the BSA. Each business relies on systems to successfully deliver their product or service to the customer. A BSA must understand the main systems that the business unit utilizes including the most common use cases, their limitations, and gaps.

The BSA is also responsible for knowing the data each system relies on, how that data interacts with other systems, and the importance of the data.

Perhaps equally as important as knowing current systems, the BSA will need to know how to develop and manage new systems. To do this, he or she should understand the software development life cycle and be relatively up to date with current technologies and trends in their space.

A is for Analyst

This is the creative problem-solving arm of the BSA. Ultimately, a BSA’s key value proposition is their ability to come up with and implement new solutions.

Understanding the current state is a fundamental part of the role and the other part is discovering how to improve. The BSA continually challenges the way things are done and looks for ways to optimize.

On the road to optimization, the BSA collaborates with the business, developers, and stakeholders to create a wonderful and hopefully adopted solution.

The Business Systems Analyst is a highly enjoyable and rewarding role. The BSA gets to collaborate with many different business units, take part in the creativity of the development team and contribute to the goals of the business.

If you’re a BSA or work with BSAs, what are some other things you think are important to the role?