Overview
The DHS Data Framework is composed of three pilot initiatives: the Cerberus Pilot, the Common Entity Index Prototype (CEI Prototype), and the Neptune Pilot. These initiatives use data tags to apply policy-based rules to determine which users can access which data for what purpose, so DHS can share its information internally while ensuring that robust policy and technical controls are in place to protect privacy.[1]
Elements for controlling data
The DHS Data Framework defines four elements for controlling data:
(1) User attributes identify characteristics about the user requesting access, such as organization, clearance and training; (2) Data tags label the data with the type of data involved, where the data originated and when it was ingested; (3) Context combines what type of search and analysis can be conducted (function) with the purpose for which data can be used (authorized purpose); and (4) Dynamic access control policies evaluate user attributes, data tags and context to grant or deny access to DHS data in the repository based on legal authorities and appropriate policies of the Department.
Through the DHS Data Framework, additional granularity is gained by controlling access based on the role and the specific function of persons conducting the search, the type of search, the purpose of the search and ultimately the authority the person has to search. Being able to harness this detail and audit it is a benefit not only to the security of the system, but also to the privacy of the data subjects.
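The interaction of the four elements can be illustrated with a minimal sketch. It is not drawn from any DHS implementation: every class, field name, clearance level and sample value below is a hypothetical chosen for illustration. It assumes a default-deny rule (access is granted only if some policy matches the user attributes, the data tag and the context) and records every decision so that it can be audited.

```python
from dataclasses import dataclass
from datetime import date

# All names and values in this sketch are hypothetical illustrations.
CLEARANCE_ORDER = ["none", "secret", "top_secret"]  # assumed ordering, lowest first


@dataclass(frozen=True)
class UserAttributes:      # element 1: who is asking
    organization: str
    clearance: str
    training: frozenset


@dataclass(frozen=True)
class DataTag:             # element 2: what the data is
    data_type: str
    origin: str
    ingested: date


@dataclass(frozen=True)
class Context:             # element 3: function plus authorized purpose
    function: str
    authorized_purpose: str


@dataclass(frozen=True)
class Policy:              # element 4: a rule evaluated at request time
    data_type: str
    organizations: frozenset
    min_clearance: str
    required_training: frozenset
    functions: frozenset
    purposes: frozenset


def permits(policy, user, tag, ctx):
    """Return True only if every condition of this one policy is met."""
    return (
        policy.data_type == tag.data_type
        and user.organization in policy.organizations
        and CLEARANCE_ORDER.index(user.clearance)
        >= CLEARANCE_ORDER.index(policy.min_clearance)
        and policy.required_training <= user.training  # subset check
        and ctx.function in policy.functions
        and ctx.authorized_purpose in policy.purposes
    )


def decide(policies, user, tag, ctx, audit_log):
    """Default deny: grant only if some policy permits; log every decision."""
    granted = any(permits(p, user, tag, ctx) for p in policies)
    audit_log.append({
        "organization": user.organization,
        "data_type": tag.data_type,
        "function": ctx.function,
        "purpose": ctx.authorized_purpose,
        "granted": granted,
    })
    return granted


POLICIES = [
    Policy(
        data_type="travel_record",
        organizations=frozenset({"component_a"}),
        min_clearance="secret",
        required_training=frozenset({"privacy_basics"}),
        functions=frozenset({"person_search"}),
        purposes=frozenset({"border_security"}),
    )
]
analyst = UserAttributes("component_a", "secret", frozenset({"privacy_basics"}))
tag = DataTag("travel_record", "system_x", date(2014, 1, 15))
```

With these sample values, calling `decide(POLICIES, analyst, tag, Context("person_search", "border_security"), log)` is granted, while the same user requesting a function that no policy lists is denied. Both outcomes are appended to `log`, which mirrors the point that harnessing and auditing this detail benefits both security and privacy.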
Perceived data risks
There are six data risks that are important to address:
- New Uses of the Data: As greater analytical capabilities are brought to bear on the data warehouse, a new segment of uses may be discovered that is outside the stated purposes of the current privacy notices.
- Privacy Notices: While the language of notices is written very broadly to accommodate many unanticipated uses, this uncertainty regarding potential new uses could reduce the transparency of processes and government functions that are designed to be transparent.
- Format of the Notices: As uses of the data change over time or new, unanticipated uses become a reality, the format of initial and changed privacy notices becomes more important to provide transparency to the person whose data is being used. In many cases notices are only provided at the time of collection as opposed to when additional technology enhancements are introduced or new uses are made of the data.
- End User: As the population to which the privacy notices apply changes, or the status of a person changes under the notice provisions in effect at the time, the rights or protections afforded to persons could change. As data is combined or introduced from multiple databases into one centralized repository, the legal status of each data point, its use, or the laws protecting it could be different. More specifically, the rights of U.S. and non-U.S. Persons may affect the initial notice and subsequent changes.
- Onward Uses: Users outside DHS may gain access to this data through enhanced searching, and a new use that was not previously considered may come about. It is therefore important to ensure that this new user comports with the original DHS law enforcement and terrorism remits and with privacy notices. By being very specific and transparent about the uses, DHS may be able to avoid future problems.
- Redress: In a centralized data repository model, incorrect data and incorrect processing may be present. Without some means of redress to correct the underlying system, the data repository will continue to propagate this incorrect data in future searches.
References
Source
- "Elements for controlling data" and "Perceived data risks" sections: Data Privacy and Integrity Advisory Committee Recommendations Paper 2014-01, at 3-5.