crime prevention – Dataconomy (https://dataconomy.ru): Bridging the gap between technology and business

How Effective Is AI Crime Prediction? Evaluating Our London Crime Prediction Model
Dataconomy, 21 Nov 2018 (https://dataconomy.ru/2018/11/21/ai-crime-prediction-dataiku/)

Last year, we set up a prediction model on crime in London. We had established the model already, grounded in open data, but updated it to make predictions about 2017. We took the data provided by the police in the greater London area, and by enriching this data with Points Of Interest from Ordnance Survey and UK Census data, we created multiple predictive models with Dataiku in order to give these predictions for 2017 at the local LSOA level.

The model was reasonably accurate, considering we only had access to open data, with the limited control over data quality that implies. But let's break down how we established our model's performance.

Building a Data Preparation Flow to Compare Monthly Observations with Predictions

The first step was to collect the 2017 police data, which I downloaded (fairly) manually onto my computer. Dataiku's partitioning system adapts to the structure of the collected files; this is how Dataiku controls inserting and updating dataset rows within a meaningful, organised structure. Partitioning also helps automate the recurring tasks that big data usage implies.
I partitioned the data by month, which helps automate our workflow. Here's what the updated project flow looked like:
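As a rough illustration outside Dataiku, the same month-based partitioning can be sketched in plain Python with pandas. The directory layout and file names below are assumptions, loosely mirroring the monthly archives published on data.police.uk:

```python
import pandas as pd
from pathlib import Path

def load_month_partition(data_dir: str, month: str) -> pd.DataFrame:
    """Load one monthly partition of police data, e.g. month='2017-03'.

    Assumes files are laid out as <data_dir>/<YYYY-MM>/street.csv,
    one folder per monthly partition (a hypothetical layout).
    """
    path = Path(data_dir) / month / "street.csv"
    df = pd.read_csv(path)
    df["month"] = month  # tag each row with its partition key
    return df

def build_year(data_dir: str, year: int) -> pd.DataFrame:
    """Concatenate the twelve monthly partitions into one dataset."""
    months = [f"{year}-{m:02d}" for m in range(1, 13)]
    return pd.concat(
        [load_month_partition(data_dir, m) for m in months],
        ignore_index=True,
    )
```

Because each month is an independent partition, refreshing the flow when a new monthly file arrives only touches that one partition rather than rebuilding the whole dataset.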

After preparing our data and joining the predictions with the LSOA boundaries, we could compare our predictions to the real observed data. To do that, I computed the residuals: the numerical differences prediction − observed_crimes.
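Concretely, the residual computation is a one-liner once predictions and observations sit in the same table; a minimal pandas sketch with hypothetical column names:

```python
import pandas as pd

def add_residuals(df: pd.DataFrame) -> pd.DataFrame:
    """Compute residual = prediction - observed_crimes per LSOA row.

    Positive residual: the model over-predicted; negative: under-predicted.
    Column names here are assumptions for illustration.
    """
    out = df.copy()
    out["residual"] = out["prediction"] - out["observed_crimes"]
    return out

scores = add_residuals(pd.DataFrame({
    "lsoa": ["E01000001", "E01000002"],
    "prediction": [42.0, 15.0],
    "observed_crimes": [50, 12],
}))
```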


Now we could compute different indicators and reshape our data in order to analyse the predictions:


Graphically Establishing Our Model Performance

It was then time to generate some insights into the process, which are useful for analysing the data. On a monthly basis, our R² metric, a standard measure of how much of the variance the model explains, is 0.88, which is fair, especially when working with limited datasets. When we look at the full year, the R² at the LSOA level is 0.95, which is better than expected, with a global prediction error of 8.7%.
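For readers who want to reproduce the metric, R² is simply 1 − SS_res/SS_tot, where SS_res is the sum of squared residuals and SS_tot the total sum of squares around the mean. A small NumPy sketch (equivalent to scikit-learn's `r2_score`):

```python
import numpy as np

def r_squared(observed, predicted) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)   # residual sum of squares
    ss_tot = np.sum((observed - observed.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot
```

A value of 1.0 means a perfect fit; predicting the mean everywhere gives 0.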


On the map, the areas (LSOAs) with the highest residuals are shown in blue (where the prediction was lower than the actual count) and red (where it was higher). As expected, the predictive model tends to underestimate the number of crimes, which is particularly apparent in the London city centre.
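The map's colouring logic can be reproduced without GIS tooling by flagging the LSOAs with the most extreme residuals; a sketch assuming a `residual` column like the one computed earlier (joining the result to LSOA boundary polygons, e.g. with geopandas, would then give the choropleth):

```python
import pandas as pd

def flag_extremes(df: pd.DataFrame, k: float = 2.0) -> pd.DataFrame:
    """Label LSOAs whose residual lies more than k standard deviations
    from the mean: 'under' (model predicted too few crimes, blue on the
    map) or 'over' (too many, red); everything else is 'ok'."""
    out = df.copy()
    mu, sigma = out["residual"].mean(), out["residual"].std()
    out["flag"] = "ok"
    out.loc[out["residual"] < mu - k * sigma, "flag"] = "under"
    out.loc[out["residual"] > mu + k * sigma, "flag"] = "over"
    return out
```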


To break down the model performance for 2017:

  • Median Absolute Error (MedAE): 19 crimes
  • Global fit (R²): 95%
  • Difference vs. reality: 8.7%
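The two error figures above can be computed side by side; a sketch with NumPy (MedAE is robust to a few extreme LSOAs, while the global error compares the city-wide totals):

```python
import numpy as np

def evaluation_summary(observed, predicted) -> dict:
    """Median absolute error and global (total-count) prediction error."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residuals = predicted - observed
    return {
        # Median of the absolute per-area residuals.
        "medae": float(np.median(np.abs(residuals))),
        # Total predicted vs total observed, as a percentage.
        "global_error_pct": float(
            abs(predicted.sum() - observed.sum()) / observed.sum() * 100
        ),
    }
```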


On a monthly basis, extreme residuals were observed in June, September, and December. This highlighted some limitations of the model. One way to improve it would be to add some features related to public events or weather conditions.


Let’s Automate Everything!

I created a scenario that calculates the accuracy statistics for the previous month. Here are the steps:

1. Build the joined observed data and predictions for the previous month (or for a specific month, passed as a partition via the DSS Public API if the data are delivered later); this step is easily automated thanks to the DSS Public API.
2. Refresh a Jupyter notebook containing some charts and metrics.
3. Build other datasets and refresh the chart cache so the updated insights can be shared in a dashboard.
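The monthly trigger can be driven from outside DSS via the Public API; a hedged sketch using the `dataikuapi` client, where the host, API key, project key, and scenario id are placeholders, and the exact run parameters depend on how the scenario is configured:

```python
from datetime import date

def previous_month(today: date) -> str:
    """Partition id (YYYY-MM) for the month before `today`."""
    first = today.replace(day=1)
    if first.month == 1:
        prev = first.replace(year=first.year - 1, month=12)
    else:
        prev = first.replace(month=first.month - 1)
    return prev.strftime("%Y-%m")

def run_monthly_scenario(host: str, api_key: str,
                         project_key: str, scenario_id: str) -> None:
    """Trigger the evaluation scenario for the previous month's partition."""
    import dataikuapi  # shipped with DSS; import kept local to the call
    client = dataikuapi.DSSClient(host, api_key)
    scenario = client.get_project(project_key).get_scenario(scenario_id)
    # Pass the partition to rebuild as a scenario run parameter
    # (parameter name is an assumption for this sketch).
    scenario.run_and_wait({"partition": previous_month(date.today())})
```

In practice the same scenario can also be scheduled inside DSS with a time-based trigger, with the API call reserved for late-arriving data.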


Developing the monthly automation took only a couple of hours at an airport, since Dataiku makes it really easy to push predictive projects into production. This way you don't have to keep updating them manually, your projects stay up to date, and, most importantly, you retain control of your predictive models. Learn more about pushing analytics into production with our free white paper.

Dataiku will be presenting at Data Natives– the data-driven conference of the future, hosted in Dataconomy’s hometown of Berlin. On the 22nd & 23rd November, 110 speakers and 1,600 attendees will come together to explore the tech of tomorrow, including AI, big data, blockchain, and more. As well as two days of inspiring talks, Data Natives will also bring informative workshops, satellite events, art installations and food to our data-driven community, promising an immersive experience in the tech of tomorrow.

Ethics to Ecotech: 5 Unmissable Talks At Data Natives 2018
Dataconomy, 25 Oct 2018 (https://dataconomy.ru/2018/10/25/data-natives-2018-best-talks/)

The pace of life and industry is accelerating at an unprecedented rate. Interconnected tech, inconceivably fast data processing capabilities and sophisticated methods of using this data all mean that we’re living in fast-forward. The Data Natives Conference 2018 will be exploring life at an accelerated pace, and what rapid innovation means for cutting-edge tech (blockchain, big data analytics, AI) across industries.

From governments to genomic projects, the quickening of life, work and research impacts every industry, and Data Natives 2018 offers two intense days of workshops, panels and talks to explore this impact. With more than 100 speakers presenting over 48 hours, the breadth of expertise at DN18 is vast; luckily, we're here to help you curate your conference experience. The Data Natives content team have selected five talks that perfectly encapsulate this year's topic and focus. Trust us, these are five presentations you can't afford to miss!

Image: Supper und Supper

1. A 21st Century Paradox: Could Tech Be the Answer to Climate Change?

Climate change is one of the greatest concerns of our lifetime, and many are wondering whether technology holds the answer to decelerating the impending climate disaster. Dr. Patrick Vetter of Supper und Supper will present one use case that demonstrates the tangible benefits of ecotech: "Wind Turbine Segmentation in Satellite Images Using Deep Learning". In layman's terms, Dr. Vetter will share the details of his project to optimise wind turbine placement using deep learning and analysis of wind energy potential. Exploring the potential of rapidly accelerating data technologies to curb the rapid acceleration of climate change, Dr. Vetter's talk is definitely one to watch.

2. Cutting Through Propaganda: Government Policy Priorities in Practice

Any citizen of a democracy knows that there's usually a huge gulf between the promises made in government officials' election manifestos and what actually becomes policy. Cutting through the propaganda, is it possible to find a quantitative measure of the government's priorities (and how they shift) over time? American Enterprise Institute Research Fellow Weifeng Zhong has been working on just such a measure: the Policy Change Index (PCI). Running machine learning algorithms on the People's Daily, the official newspaper of the Communist Party of China, Zhong has found a way to infer significant shifts in policy direction. The PCI currently spans the past 60+ years of Chinese history (through the Great Leap Forward, the Cultural Revolution, and the economic reform program) and can now also make short-term predictions about China's future policy directions. Zhong will let us glimpse under the hood of the PCI at Data Natives 2018, as well as sharing some of the more remarkable findings with us.


3. Blockchain: Beyond a Buzzword

Over our four editions of Data Natives, we've seen blockchain emerge from a promising but niche sphere into a full-blown game-changing technology. However, blockchain and decentralised computing are still shrouded in hype, and have a long way to go to garner full consumer trust. That's where Elke Kunde, Solution Architect, Blockchain Technical Focalpoint DACH at IBM Deutschland, comes in. Her talk on "Blockchain in Practice" at Data Natives 2018 aims to demystify blockchain, slash through the hype, and enlighten the audience about how IBM clients are already using decentralised computing in their tech projects. This talk is a must-see for anyone who's excited by the promise of blockchain but still unclear on how exactly decentralisation can change the tech game, and their business, forever.

4. Using Machine Learning to Predict (and Hopefully Prevent) Crime

Predictive policing has been a hot topic for many years, and the technical methods behind it have become more sophisticated than ever before. Du Phan, a Data Scientist at Dataiku, will walk DN18 attendees through one particularly sophisticated model, which uses a variety of techniques including PostGIS, spatial mapping, time-series analyses, dimensionality reduction, and machine learning. As well as discussing how to visualise and model the multi-dimensional dataset, Phan will also discuss the ethical principles behind predictive policing, and what we can do to prevent crime rather than merely predict it.

Image: jeniferlynmorone.com

5. Putting a Price on Personal Data

Data privacy and the price of personal data have been hot topics for years, coming to a boil with events such as the Cambridge Analytica scandal. Even Angela Merkel has declared that putting a price on personal data is "essential to ensure a fair world", but how do we put a price on data, and how can this be enforced? Jennifer Lyn Morone, the artist who registered herself as a corporation and sold dossiers of her personal data in an art gallery, will discuss her perspective on these issues in a closing keynote that will bring the ethics of data science into focus.



Data Natives will take place on the 22nd and 23rd November at Kuhlhaus Berlin. For tickets and more information, please visit datanatives.io. 

Data Science Leveraged to Stop Human Trafficking
Dataconomy, 23 May 2016 (https://dataconomy.ru/2016/05/23/data-science-leveraged-stop-human-trafficking/)

Finding missing children and unraveling the complex web of human trafficking is no easy task. The relevant datasets are massive and often unstandardized. It can be difficult to find the right data at all, as it often disappears from websites and pages on a regular basis. When data is hard enough for scientists to capture and evaluate, how can law enforcement agencies even begin to get a handle on it? These agencies, with little funding or know-how, need real help if they want to leverage big data and get a grip on human trafficking.

Many of the efforts to solve crimes with data are actually coming from outside law enforcement itself. From community efforts to non-profits and even full business solutions, the world of data science is actively using its skills for good. More importantly, these targeted data solutions stand in stark contrast to the more general and vague job of crime prediction, which is becoming more and more common. Many departments already use data to target trouble areas, but for crimes that involve huge rings and layers of corruption, there's a lot more work to be done.

The companies using data science to stop human trafficking often combine several methods and mimic what regular law enforcement agencies might do on their own. The "Science Against Slavery" hackathon was an all-day event aimed at sharing ideas and creating science-based solutions to the problem of human trafficking. Data scientists, students and hackers homed in on data that district attorneys would otherwise never find. Many focused on automating processes so agencies could use the technology with little guidance. Some focused primarily on generating data that could lead to a conviction, which is much easier said than done. One effort, from EPIK Project founder Tom Perez, involved creating fake listings so that information could be gathered on respondents, including real-world coordinates. Other plans compared photos mined from escort ads and sites to those from missing person reports. Web crawling could eventually lead to geocoding phone numbers, understanding the distribution of buyers and sellers, and social network analysis.

Turning Big Data Into Real World Information

Perhaps one of the more famous initiatives comes from the Polaris Project, started in 2002 and revitalized in 2012 through the use of data science. When the organization heard a talk from the CEO of Palantir, a software and data analysis company, it was clear that the fight against human trafficking needed an upgrade, and a big one. With some help from Palantir, Polaris was soon armed with new technology and engineers, and began leveraging data from phone calls, company contacts, legal service providers, and every other part of its organization in one simple platform.

Palantir has helped other organizations, like the National Center for Missing and Exploited Children (NCMEC), in a similar fashion. By combining data from public and private sources, the organization pinpointed 170 different quantitative and qualitative variables per case record. Advanced analytics were required to evaluate tips, of which 31,945 came by phone, 1,669 through online submission, and 787 from SMS. The project also aimed to digitize old records spanning several decades and import them into a single searchable, analyzable structure. All of this data is powerful, but the final step was making it easily accessible. With the numerous formats and levels of information imported into one database, what once took several weeks, or was impossible entirely, can now be done in an instant.

The story of one missing 17-year-old girl in California has since become the shining example of data triumphing in the world of human trafficking. Using data science, analysts were able to find multiple online posts advertising the missing girl for sex. By analyzing over 50 ads featuring nine different women across five states, analysts didn't just find the girl: they saw the larger ring and were able to link the pimp to other crimes and victims.

Visualizations and Easy Solutions for Law Enforcement

The BBC has reported on the amount of data available, and how those terabytes aren’t as immediately helpful as the public would like to think. Child sex abuse raids tend to lead to unbelievable amounts of data. Image forensic specialist Johann Hoffman laments, “the problem is, how as a police officer do you go through that huge amount of data? When you are dealing with terabytes there’s no way a human could ever go through it all.” Using analytics, however, has given them an entirely new approach to data. Friendly data platforms and visualizations help generate a larger story that doesn’t require a master’s degree to understand.

There are several more examples, but one particularly interesting area is data solutions marketed toward law enforcement. One Y Combinator startup wants to act as a paid service for law enforcement. It may feel a tad weird to read a tagline like "the right data at the right time can make or break your prosecution," but these external companies offer expertise that law enforcement employees likely won't otherwise have access to. Plus, to make the entire concept a bit more palatable, this particular startup, Rescue Forensics, only registers official law enforcement agencies, as opposed to just anyone willing to pay up. Most escort advertisements disappear after a few days, making them incredibly difficult to track. Companies like these, which focus entirely on data tracking, analysis and storage, can keep otherwise lost information alive for those who need it.

The splintered nature of the entire field might also be one of its biggest assets, for the time being. While splintering in some sectors causes huge problems and ultimately holds users back from progress, the array of approaches in this area reflects just how many people are interested in creating solutions. These different companies come with different backgrounds and goals, and will ultimately open up new and exciting possibilities. Many operate on open-source platforms, meaning we can expect the number of solutions to continue to skyrocket.

Like this article? Subscribe to our weekly newsletter to never miss out!

Seattle Police Just Got an Edge Over Crime with the Inception of the SeaStat Programme
Dataconomy, 19 Sep 2014 (https://dataconomy.ru/2014/09/19/seattle-police-just-got-an-edge-over-crime-with-the-inception-of-the-seastat-programme/)

Taking their cues from the LAPD, the Seattle Police Department’s latest crime data mining program, called SeaStat, will identify crime hotspots based on analysis of crime data and community reports of incidents.

According to the department's news release, a rise in crime in the Capitol Hill area was the final straw that put the program into action. SPD has boosted patrols there and is closely monitoring the area to see if crime falls.

Under the guidance of newly appointed Chief Kathleen O'Toole, the intention is to use the SeaStat process to resolve issues as soon as they're identified. "The regular meetings are intended to help department staff assess if solutions are working, and develop other strategies if they're not," the department says.

Communities and residents can interact directly with the department to share their views, which are taken into account through regular meetings. The community feedback, along with analysis of crime data, will help adjust the precinct community policing plans now under development.

“We’re in the crime fighting business,” quipped SPD’s Chief Operating Officer (COO) Mike Wagers. “We’ve identified the trends and are working hard with our many partners to reverse them.”

New figures released by the department on Wednesday reveal that 29,554 crimes were reported in the first eight months of 2014, compared to 26,152 in the same period last year: an increase of roughly 3,400 crimes, or 13 percent. At such a juncture, many believe that SPD's latest offering could be a welcome relief.

Read more here.


(Image credit: KP Tripathi)

‘Big Data is Making Australia Safer’: Using Big Data to Fight Crime
Dataconomy, 13 May 2014 (https://dataconomy.ru/2014/05/13/big-data-making-australia-safer-using-big-data-fight-crime/)

The future of crime fighting is moving away from reacting to incidents as they occur and towards ‘predicting’ crime in order to prevent it. The LAPD have already spoken about using earthquake models to predict crime ‘aftershocks’; now, the Australian Crime Commission are scanning massive sets of data to examine criminal threats across the country. They have spent $14.5 million over the last four years developing big data systems to identify these trends, meaning they can take a more proactive approach to identifying and tackling crime.

However, speaking at the CeBIT tech conference in Sydney, ACC chief information officer Maria Milosavljevic was keen to emphasise that their work was less about the idea of predicting specific crimes, and more about examining ‘a threat that is increasing, and predicting that it is going to continue to increase based on what we’ve seen in the past’. Discussing the importance and possibilities of Big Data, she stated ‘We live in an algorithmic age, we live in an age where we have access to a lot of information and we’ve moved to a world where strategy and vision setting can be adjusted on the basis of what we can see in information’.

One advantage the ACC have found in analysing huge amounts of data is that it broadens crime fighting beyond one particular jurisdiction. By having a much wider, national view of crime patterns, they are able to identify which areas are tackling the same problems and pool their resources. Milosavljevic also stated that being able to identify threats faster and with greater accuracy means response time is shorter, and that information can be shared between partners with greater speed and efficiency.

Moving forward, the ACC are looking at how to incorporate more unstructured audio and visual data into their analysis. Milosavljevic highlighted the variety of data beyond text and spreadsheets as one of the main challenges facing the system: ‘There are some tools that allow you to do some things, but it’s limited’.


Read more here.

(Image credit: Simon Yeo)
