The Inner Circle


Using production data in test environment

  • 1.  Using production data in test environment

    Posted 26 days ago
    Edited by Raji Krishnamoorthy 26 days ago

    For a U.S.-based financial services organization, is it OK to run a data-masking exercise on production data and then use the masked data in the test environment?
    Will there be any compliance issues due to this?


  • 2.  RE: Using production data in test environment
    Best Answer

    Posted 25 days ago
    This would all depend on the sensitivity of the original data and the risk tolerance of the organization. Personally, I wouldn't do it given the potential risk of exposing sensitive production data. If you can't acquire an appropriate synthetic data set, I'd recommend creating one. If it's a significant volume of data, I'd recommend initially reducing the target size of the data set to something you can manage. As for compliance issues, I think it's difficult to say without more details on the type and sensitivity of the original data, where it currently lives versus where the test environment would be, etc.

    Based on the limited information in your question, Raji, I'd advise against it.
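
    To make the synthetic-data suggestion concrete, here is a minimal sketch of generating a small, repeatable set of fake records. The `SyntheticAccount` fields, the name pools, and the value ranges are all hypothetical and would need to mirror your own schema; nothing in it is derived from production.

```python
import random
import string
from dataclasses import dataclass


@dataclass
class SyntheticAccount:
    account_id: str      # random digits, not derived from any real account
    holder_name: str     # drawn from a made-up name pool
    balance_cents: int   # random value in an assumed plausible range


# Hypothetical name pools; none of these come from production.
FIRST_NAMES = ["Alex", "Sam", "Jordan", "Taylor", "Morgan"]
LAST_NAMES = ["Smith", "Lee", "Patel", "Garcia", "Nguyen"]


def make_synthetic_accounts(count, seed=0):
    """Generate `count` fake accounts; a fixed seed keeps test runs repeatable."""
    rng = random.Random(seed)
    accounts = []
    for _ in range(count):
        account_id = "".join(rng.choices(string.digits, k=10))
        name = f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}"
        balance = rng.randint(0, 5_000_000)
        accounts.append(SyntheticAccount(account_id, name, balance))
    return accounts


# Start with a small, manageable set and grow it only if the tests need it.
sample = make_synthetic_accounts(count=100)
```

    Starting at 100 rows is just an example of the "reduce the target size" idea; the right size depends on what the tests actually exercise.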

    Lorenzo Winfrey
    Senior Solution Manager
    Rackspace Technology

  • 3.  RE: Using production data in test environment

    Posted 5 hours ago

    In my 15 years as a developer, we had many situations where production data would have been great for the test environment. Mostly the value was in the connections between the data and some additional metadata that helps the test process. What I typically do is run a data anonymization process and remove GDPR-protected personal data from the test set.
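
    A rough sketch of that kind of process, assuming two hypothetical tables (`customers` and `orders`) held as lists of dicts and an assumed set of personal fields. It drops the personal fields and swaps each real customer id for a random surrogate, so the connections between records survive while the original ids do not. Dropping fields alone does not guarantee anonymity, since the remaining columns can still act as quasi-identifiers.

```python
import secrets

# Hypothetical field names treated as personal data in this sketch.
PERSONAL_FIELDS = {"name", "email", "phone"}


def anonymize(customers, orders):
    """Return copies with personal fields dropped and customer ids replaced."""
    # One random surrogate per real id. The mapping is local to this call
    # and is not returned, so it is not available for re-identification.
    surrogate = {c["customer_id"]: secrets.token_hex(8) for c in customers}

    clean_customers = [
        {
            **{k: v for k, v in c.items() if k not in PERSONAL_FIELDS},
            "customer_id": surrogate[c["customer_id"]],
        }
        for c in customers
    ]
    # Orders keep pointing at the same customer via the surrogate id.
    clean_orders = [
        {**o, "customer_id": surrogate[o["customer_id"]]} for o in orders
    ]
    return clean_customers, clean_orders


# Made-up example rows.
customers = [
    {"customer_id": 1, "name": "Jane Doe", "email": "jane@example.com", "segment": "retail"}
]
orders = [{"order_id": 10, "customer_id": 1, "amount": 25}]
clean_customers, clean_orders = anonymize(customers, orders)
```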

    Best regards

    Tomasz Janczewski

  • 4.  RE: Using production data in test environment

    Posted 25 days ago
    If this environment were healthcare, there would be hefty restrictions on ever using production data in testing environments, masked or otherwise.  As tightly regulated as financial services is in the US, I can find no basis for justifying this; I would regard it as completely forbidden. In my own practice (security, privacy, and regulatory compliance), I do not permit any client to employ this practice.  And yes, there are definite compliance issues with this:

    a.  Testing and development personnel are generally not permitted access to live production data that is not fully de-identified - the type of environment does not matter.
    b.  If the de-identification method is reversible in any way and unauthorized personnel gain access to the environment, that combination all but guarantees a breach and data theft.
    c.  Using the rationale of trying to save money will not be accepted by regulatory authorities as justification due to the perceived unacceptable and avoidable risk.  The organization's risk acceptance/appetite/tolerance would likely be rejected due to the nature of the regulations in play and the nature of the risk itself; not to mention the existence of various acceptable alternatives.
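
    To illustrate point (b), here is a small sketch of why some "masking" is reversible in practice. It assumes a made-up 4-digit value standing in for any low-entropy identifier; the function names are hypothetical. An unsalted hash of such a value can be recovered by simply trying every candidate, while a keyed HMAC resists that attack only as long as the key never reaches the test environment.

```python
import hashlib
import hmac
import secrets


def weak_mask(value):
    """Unsalted hash: deterministic, so small input spaces can be enumerated."""
    return hashlib.sha256(value.encode()).hexdigest()


def recover_by_enumeration(masked, digits=4):
    """Try every possible value; feasible whenever the input space is small."""
    for n in range(10 ** digits):
        candidate = str(n).zfill(digits)
        if weak_mask(candidate) == masked:
            return candidate
    return None


def keyed_mask(value, key):
    """HMAC with a secret key; enumeration fails without the key."""
    return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()


pin = "4821"  # made-up value, not real data
recovered = recover_by_enumeration(weak_mask(pin))  # recovers "4821"

key = secrets.token_bytes(32)  # must never be stored in the test environment
token = keyed_mask(pin, key)
```

    Note that keyed masking is still reversible by enumeration for anyone who holds the key, so this only narrows the risk; it does not turn masked production data into fully de-identified data.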

    The notion of using production data in any form other than completely sanitized has been rejected for as long as I have been working in the IT world (over 35 years).  Even recognizing that producing a test data set is somewhat troublesome (valid test results are, after all, rather important), I can envision no justifiable rationale to support using live data in any form other than fully de-identified and sanitized.

    Ross Leo
    Galen Data, Inc.