To detect Organised Crime and to protect society against it, information from numerous sources (catalogues, reports, geodata, forums/blogs, social networks, etc.) must be integrated. These sources are considerably heterogeneous with respect to data formats and structures, coverage, quality and means of access. The Linked Data paradigm enables efficient and effective networking and integration of heterogeneous, distributed data. LiDaKrA aims at a holistic approach to extract, network and fuse crime-relevant information from public and private sources such as:
- the Web in general, the Social Web (social networks, blogs or wikis), the Deep Web (eCommerce databases such as eBay or Amazon Marketplace), the Dark Web (information from the Tor network) and the Data Web (open data such as DBpedia or GeoNames)
- public databases (e.g. trade registers, or credit bureaus such as Genios or Bürgel)
- shared internal catalogues (e.g. of the catalogue editorial department of the German Federal Criminal Police Office (BKA), the PEP list of the EU, or WorldCheck)
- databases of investigation authorities
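As a minimal sketch of how the Linked Data paradigm could support this kind of integration, the following Python fragment maps records from two differently structured sources onto subject-predicate-object triples that share identifiers, merges them, and answers a question that spans both sources. Everything in it is an assumption made for illustration: the `http://example.org/` namespace, the field names, the sample records and all function names are hypothetical and do not describe the LiDaKrA platform or any real data source.

```python
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical namespace used only for this illustration.
EX = "http://example.org/"


def register_to_triples(row: Dict[str, str]) -> Set[Triple]:
    """Map a made-up trade-register row to triples."""
    subject = EX + "company/" + row["reg_no"]
    return {
        (subject, EX + "name", row["company_name"]),
        (subject, EX + "registeredIn", row["city"]),
    }


def listing_to_triples(item: Dict[str, str]) -> Set[Triple]:
    """Map a made-up marketplace listing to triples.

    The seller is referenced by the same company URI that the
    register source uses, which is what links the two sources.
    """
    subject = EX + "listing/" + item["id"]
    seller = EX + "company/" + item["seller_reg_no"]
    return {
        (subject, EX + "title", item["title"]),
        (subject, EX + "offeredBy", seller),
    }


def integrate(*graphs: Iterable[Triple]) -> Set[Triple]:
    """Merge triple sets from several sources into one graph."""
    merged: Set[Triple] = set()
    for graph in graphs:
        merged |= set(graph)
    return merged


def listings_by_city(graph: Set[Triple], city: str) -> List[str]:
    """Return listings offered by companies registered in `city`."""
    companies = {
        s for s, p, o in graph
        if p == EX + "registeredIn" and o == city
    }
    return sorted(
        s for s, p, o in graph
        if p == EX + "offeredBy" and o in companies
    )


# Invented sample records from two heterogeneous sources.
register_row = {"reg_no": "HRB1", "company_name": "Example GmbH", "city": "Bonn"}
listing_item = {"id": "42", "title": "Sample item", "seller_reg_no": "HRB1"}

graph = integrate(
    register_to_triples(register_row),
    listing_to_triples(listing_item),
)
print(listings_by_city(graph, "Bonn"))
```

The point of the sketch is that, once both sources refer to the same entity by the same identifier, a query can cross source boundaries without either source knowing the other's schema. A real system would more plausibly use an RDF store and a query language such as SPARQL instead of in-memory sets.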
On top of the on-demand integration of such data, we will combine machine learning approaches with statistical, rule-based and semantic techniques for big data analysis to answer questions such as:
- What patterns identify characteristic modi operandi of organised crime on the Internet?
- How can one search for such patterns in a targeted way, and how can they be predicted?
- Which of the information sources used provides the most useful information?
- How can one model the evolution and dynamics of characteristic patterns?
The technical components will be implemented in an integrated platform and evaluated in concrete use cases with multiple stakeholders from crime investigation authorities.