Research Computing

The Secure Research Environment (SRE) is a new University virtual environment designed to protect sensitive and restricted research data from misuse and unauthorized access.  The SRE is different from the current UMB research computing environment in that the computing resources, data storage and software are not located on a local desktop or laptop computer but are available in a secure Cloud infrastructure. 

The SRE minimizes risk to the institution and to the principal investigator of an unlawful exposure of sensitive data.

University of Maryland, Baltimore (UMB) 
Guidebook for the Secure Research Environment (SRE)

Introduction

It is critically important for UMB to apply a high level of data security in protecting health-related information and other sensitive personally identifiable information, such as Social Security numbers. A UMB Secure Research Environment (SRE) has been created to protect sensitive data used by faculty for research purposes, as well as the intellectual property that develops from research studies. Use of the SRE is mandatory when obtaining sensitive data provided by the University of Maryland Medical System (UMMS). Use of the SRE is recommended for research projects that contain sensitive data from other data sources. The SRE complies with HIPAA’s standards, as well as other IT security policies and requirements, for properly securing protected health information (PHI) and personally identifiable information (PII).

New UMB Research Computing Environment

The SRE is a new University virtual environment designed to protect sensitive and restricted research data from misuse and unauthorized access.  UMB faculty researchers can focus on performing research while knowing that the data being used for research purposes are highly secured.  The SRE minimizes risk to the institution and to the principal investigator of an unlawful exposure of sensitive data.

The SRE is different from the current UMB research computing environment in that the computing resources, data storage and software are not located on a local desktop or laptop computer but are available in a secure Cloud infrastructure.  A faculty researcher simply opens a web browser, connects to their secure research environment, and sees the data and software that they need to perform research analyses.  The data are saved in the Cloud-based infrastructure.  There is no need to use the computing power of a local computer or to store data on a local machine.  It is an analogous user experience to logging in remotely to a desktop computer, where a researcher sees a personalized screen that is familiar to them.

Who do I contact to get access to the SRE?

For sensitive data provided by UMMS: EDA-Research@umm.edu

For all other sensitive data: SRE-Support@umaryland.edu

Definition of Terms Used in this Document 

Term

Definition

AVD

Azure Virtual Desktop - the virtual environment that SRE uses

Azure infrastructure

Microsoft’s cloud platform; an evolving collection of integrated cloud services spanning compute, data storage, and software applications

Cloud computing

The delivery of computing services—including servers, storage, databases, software, and analytics—over a computer network

Data steward

Person responsible for ensuring the quality, security, and fitness of the data for the purpose of the research

Egress of data

The output flow of research results

Epic

The medical data repository UMMS uses for research data

Faculty researcher

The UMB faculty member sponsoring the research project, often the Principal Investigator. The data and SRE requestor must be a faculty researcher.

HIPAA

Health Insurance Portability and Accountability Act - a federal law that required the creation of national standards to protect sensitive patient health information from being disclosed without the patient's consent or knowledge

Honest broker

The data steward for the owner of research data who acts to collect and provide that data to research investigators

ICTR

Institute for Clinical & Translational Research

IHC

Institute for Health Computing - leverages advances in network medicine, artificial intelligence (AI), and machine learning to create a premier learning health care system that evaluates both de-identified and secure digitized medical health data to improve outcomes for patients across the state of Maryland

Ingress of data

The input of research data used for analyses

Intellectual Property (IP)

Creations of the mind such as research work or collections of data analyses

PHI

Protected Health Information - a.k.a. personal health information. Examples include:

  • Names
  • Geographic subdivisions smaller than a state (Note: this includes ZIP code)
  • Elements of dates (except year)
  • Ages over 89
  • Telephone numbers
  • Vehicle identifiers and serial numbers, including license plate numbers
  • Fax numbers
  • Device identifiers and serial numbers
  • Email addresses
  • Web Universal Resource Locators (URLs)
  • Social Security numbers
  • Internet Protocol (IP) addresses
  • Medical record numbers
  • Biometric identifiers, including finger and voice prints
  • Health plan beneficiary numbers
  • Full-face photographs and any comparable images
  • Account numbers
  • Any other unique identifying number, characteristic, or code
  • Certificate/license numbers

PI

Principal investigator - the main researcher on a project

PII

Personally Identifiable Information - information that, when used alone or with other relevant data, can identify an individual. Examples include an individual’s first name or first initial and last name in combination with any one or more of these data elements:

  • A Social Security number, an Individual Taxpayer Identification Number, a passport number, or other identification number issued by the federal government
  • A driver’s license number or state identification card number
  • An account number, a credit card number, or a debit card number, in combination with any required security code, access code, or password, that permits access to an individual’s financial account
  • Health information, including information about an individual’s mental health, or a medical record number
  • A username or e-mail address in combination with a password or security question and answer that permits access to an individual’s e-mail account

RIC

Research Informatics Core – the group within the ICTR that administers the UMMS-controlled research data

Sensitive data

Personal data, such as health-related data, and other types of data that are not meant to be made public

SRE

Secure Research Environment

UMB

University of Maryland, Baltimore

UMMS

University of Maryland Medical System

Frequently Asked Questions

Who can get access to the SRE? 

Any University of Maryland, Baltimore faculty member performing research can get access to and use the SRE.

 

What is the process to get access? 

 

SRE Workflow for Principal Investigators

SRE = Microsoft Azure Secure Research Environment; RIC = UMMS Research Informatics Core; PI = Principal Investigator

Note: If the PI is from UMD (University of Maryland, College Park), the PI must obtain a UMB ID.

  1. PI discusses data request with RIC
  2. PI completes ICTR form requesting data
  3. RIC interviews PI for data, storage, & computing needs
  4. RIC shares data, storage, and computing info with UMB IT
  5. UMB IT discusses SRE and potential costs with RIC and PI
  6. UMB IT onboards PI to SRE
  7. RIC moves the IRB-approved data to SRE
  8. PI begins research in SRE

SRE Workflow for Principal Investigators with Non-UMMS Data Source

SRE = Microsoft Azure Secure Research Environment;

UMB IT = University of Maryland Information Technology Group;

PI = Principal Investigator

  1. PI contacts UMB IT to discuss data, storage, and computing needs
  2. UMB IT onboards PI to SRE
  3. PI begins research in SRE

 

When should a PI move an existing project to SRE? 

What if I already have a current project?

Does the PI have an existing project?

  • No: Is there a project request submitted with ICTR?
    • No: The PI should reach out to the RIC at EDA-Research@umm.edu to start the project request
    • Yes: Does it use data governed by UMMS?
      • Yes: Reach out to the RIC at EDA-Research@umm.edu to discuss moving the project to SRE.
        • UMB IT will also get involved after initial consultation with the RIC.
          • An SRE is created for your project
      • No: Reach out to UMB IT at SRE-Support@umaryland.edu to discuss moving the project to SRE
        • An SRE is created for your project
  • Yes: Does it use data governed by UMMS?
    • Yes: Reach out to the RIC at EDA-Research@umm.edu to discuss moving the project to SRE.
      • UMB IT will also get involved after initial consultation with the RIC.
        • An SRE is created for your project
    • No: Reach out to UMB IT at SRE-Support@umaryland.edu to discuss moving the project to SRE
      • An SRE is created for your project
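
The decision tree above can be sketched as a small routing function. This is only an illustrative sketch: the function name and its parameters are hypothetical, and the only facts taken from this guidebook are the two contact addresses and the branching rules.

```python
# Contact addresses as listed in this guidebook.
RIC_EMAIL = "EDA-Research@umm.edu"            # UMMS Research Informatics Core
UMB_IT_EMAIL = "SRE-Support@umaryland.edu"    # UMB IT

def first_contact(has_existing_project: bool,
                  ictr_request_submitted: bool,
                  uses_umms_data: bool) -> str:
    """Return the address a PI should contact first (hypothetical helper)."""
    # No existing project and no ICTR request yet: start the request with the RIC.
    if not has_existing_project and not ictr_request_submitted:
        return RIC_EMAIL
    # Otherwise the branch depends only on who governs the data.
    return RIC_EMAIL if uses_umms_data else UMB_IT_EMAIL

if __name__ == "__main__":
    # An existing project that does not use UMMS-governed data goes to UMB IT.
    print(first_contact(True, False, False))
```

In every branch that reaches a data-governance question, the outcome is the same: an SRE is created for the project after the initial consultation.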

 

How does data flow with the SRE?

{Diagram of data flow, i.e. ingress and egress}

Microsoft Azure Secure Research Environment

Highly restricted inbound and outbound public and private network access

Data Steward

  • Epic or other data sources
  • Requested Data
    • Automation
      • Ingress Folder (Azure Virtual Desktop)
      • Egress Folder (Azure Virtual Desktop)
        • Automation

Researcher

  • Remote Desktop
    • Research Desktop (Azure Virtual Desktop)
      • Ingress Folder (Azure Virtual Desktop)
      • Egress Folder (Azure Virtual Desktop)

Principal investigators will need to work with an “honest broker” to obtain the data being used for their research. An honest broker is the data steward for the owner of research data who acts to collect and provide that data to research investigators. The data steward is responsible for ensuring the quality, security, and fitness of the data for the purpose of the research.

For data sources governed by UMMS, the RIC will act as the honest broker. For other data sources, UMB IT will act as the honest broker.

 

Is training needed/required in order to use the SRE? 

Formal training is not required. Using the SRE is as easy as using your own computer. Instructions on how to access the SRE and use your data to perform research will be provided once the SRE request has been fulfilled.

Who do I contact if I have a question or need assistance with the SRE? 

Send process questions about using UMMS data to EDA-Research@umm.edu

Send process questions about using non-UMMS sensitive data to SRE-Support@umaryland.edu

Send technical questions to SRE-Support@umaryland.edu

What is the SRE? 

The UMB Secure Research Environment (SRE) is a centralized virtual environment designed to protect sensitive and restricted research data.  Secure virtual desktop environments and custom compute configurations allow researchers to access sensitive data under a higher level of control and data protection. Data is segregated per research project and only accessible by the research team that is assigned to their research environment.

The SRE is an isolated environment where users can use the software programs and tools available in the SRE to conduct their research and analyses. The ingress (input) of sensitive data into the SRE is managed by members of the Research Informatics Core (RIC) and/or the UMB IT support group. The egress (output) of any sensitive data or files from the SRE is also controlled by members of the Research Informatics Core (RIC) and/or UMB IT support group. The egress of non-sensitive data, e.g., summary or aggregated data that does not contain PHI or PII, can also be performed.

 

Why do I need to use the SRE?

The SRE is now a required computing environment for using University of Maryland Medical System (UMMS) clinical data for research purposes. At this time, other PIs are invited to use it as well.  Principal investigators will use the SRE to protect their sensitive research data and intellectual property.

The SRE is a virtual computing space specifically designed to secure sensitive data, such as PII and PHI, and intellectual property. UMB, in partnership with UMMS, has built the SRE on the Microsoft Azure cloud computing infrastructure, and it is accessible via Azure Virtual Desktop. The Azure environment and selected tools meet HIPAA’s standards for handling protected health information/patient data and other sensitive, personally identifiable data. The UMB and UMMS Data Use Agreement specifies the use of the Microsoft Azure infrastructure for the purpose of securing sensitive data.

What analytical tools/applications are currently available in SRE? 

Analytical Tools Available in SRE:

  • R
  • RStudio
  • Rtools
  • SAS*
  • SPSS*
  • Python
  • Visual Studio Code
  • MS Office
  • Anaconda3 2022.10
  • DataSpell 2023.1.1
  • PyCharm Community Edition

* Provide your license key to use these in SRE 

If you don't have a license, IT can help you get one

Need additional tools/software not on this list, or customized tool kits?

Contact UMB IT at SRE-Support@umaryland.edu to discuss getting them added to your SRE. For licensed software, UMB IT will work with you to either apply your license key or help you get one.

 

How Do Individuals Use Azure Virtual Desktop (AVD)? 

To access AVD, a general user or researcher simply opens a web browser, connects to the AVD portal, and logs in with their institutional credentials. They will then see the applications and desktops published to their ID.

 

They have access to the applications and software that they need, perform work and research analyses via AVD, and save their data and documents directly through this interface to the Azure infrastructure.

 

The user experience is very similar to logging in remotely to a desktop, where a general user or researcher will see a personalized desktop and applications that are familiar to them.

How are compute and data storage configured for my project?

Research needs are discussed with an IT engineer who will advise the PI as to the appropriate compute configuration and data storage needs required to support the project. 

Is there a limit on the file size or data storage capacity for my projects?  

There is no set limit on the data storage capacity, but to be judicious and cost-effective, a PI with the help of an IT engineer will determine an appropriate amount of storage that would support the project.

How long can I keep/store data in SRE?  

The data should reside in the SRE during the course of the research project. After project completion, a cost-effective and secure approach to the data should be used, e.g., the data should be securely archived or deleted.

What is the cost associated with using the SRE?  

The following are estimated benchmark compute and data storage costs when using the SRE. A review of the research project with an IT engineer will determine an appropriately sized computing configuration and data storage allocation.  The review will help determine which compute configuration and data storage allocation would be the most cost-effective solution to support the project.   

Compute                                                          Estimated Yearly Cost per User
Standard Sized Virtual Machines (100 usage hrs./month)           $100/year or approx. $0.08/hr.
Standard Sized Virtual Machines (200 usage hrs./month)           $200/year or approx. $0.08/hr.
Standard Sized Virtual Machines (400 usage hrs./month)           $400/year or approx. $0.08/hr.
Standard Sized Virtual Machines (24x7 usage per month)           $600/year or approx. $0.07/hr.
Enhanced Virtual Machines (High-Performance Computing, 24x7)     $4,100/year

Data Storage
Standard Datasets (standard storage)                             $0.26 per gigabyte per year
Large Complex Datasets (premium storage)                         $1.85 per gigabyte per year

Examples of Estimated Cost:
Standard VMs used 100 hrs./mo. + 50 gigabytes of standard data storage:          $113/yr.
Standard VMs used 100 hrs./mo. + 500 gigabytes of standard data storage:         $230/yr.
Standard VMs used 200 hrs./mo. + 500 gigabytes of standard data storage:         $330/yr.
Standard VMs used 400 hrs./mo. + 1000 gigabytes of standard data storage:        $660/yr.
Standard VMs used 24x7 + 2000 gigabytes of standard data storage:                $1,120/yr.
Enhanced VMs (HPC running 24x7) + 4000 gigabytes of premium data storage:        $11,500/yr.
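
The example estimates above appear to follow a simple rule: yearly compute cost plus gigabytes of storage multiplied by the per-gigabyte price. The sketch below reproduces that arithmetic. It assumes the storage prices are charged per gigabyte per year (which is what the examples imply), and the tier names and function name are hypothetical labels, not official identifiers. Actual costs are determined in the review with an IT engineer.

```python
# Benchmark figures copied from the tables above (USD per user per year).
COMPUTE_PER_YEAR = {
    "standard_100h": 100,    # Standard VMs, 100 usage hrs./month
    "standard_200h": 200,    # Standard VMs, 200 usage hrs./month
    "standard_400h": 400,    # Standard VMs, 400 usage hrs./month
    "standard_24x7": 600,    # Standard VMs, 24x7
    "enhanced_24x7": 4100,   # Enhanced VMs (HPC), 24x7
}

# Assumption: storage is priced per gigabyte per year.
STORAGE_PER_GB_YEAR = {"standard": 0.26, "premium": 1.85}

def estimate_yearly_cost(compute_tier: str, storage_gb: float,
                         storage_type: str = "standard") -> int:
    """Estimate yearly cost in whole dollars (software licenses excluded)."""
    compute = COMPUTE_PER_YEAR[compute_tier]
    storage = storage_gb * STORAGE_PER_GB_YEAR[storage_type]
    return round(compute + storage)

if __name__ == "__main__":
    print(estimate_yearly_cost("standard_100h", 50))             # 100 + 50 * 0.26
    print(estimate_yearly_cost("enhanced_24x7", 4000, "premium"))  # 4100 + 4000 * 1.85
```

The hourly figures in the compute table follow the same way: 100 hours per month is 1,200 hours per year, so $100 per year is roughly $0.08 per hour.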

 

Please note that software is NOT included in the pricing. The faculty researcher and/or the research sponsor must acquire any software licenses needed for the project.

APPENDICES

Appendix A: What is Microsoft Azure? 

Azure, Microsoft’s cloud platform, is an evolving collection of integrated Cloud Services spanning compute, data storage, and software applications.

Appendix B: What are the Benefits of Using Microsoft Azure? 

Reduced operational overhead.  No need to:

  • Dedicate physical space for computing equipment.
  • Monitor hardware health, manage firmware, and repair failed hardware.
  • Perform complex hardware replacements.
  • Size, purchase, house, & maintain:
    • Server and data storage equipment
    • Datacenter networking equipment
    • Complex datacenter network connectivity
    • Uninterruptible power supply (UPS) equipment and power feeds
    • Large, expensive HVAC equipment

Capacity

  • Azure has massive compute capacity, virtually unlimited computing resources that can scale as needs grow. We have the ability to quickly provision resources, such as servers, in extremely large quantities, use those resources for as long as necessary and immediately de-provision them when they are no longer required.  This model eliminates the need for over-provisioning resources to meet unknown future demands.

Agility

  • Virtual servers can be provisioned and deployed quickly, rather than taking weeks or months needed to procure and configure on-campus equipment.

Redundancy

  • Microsoft has 69 Azure geographic regions, which offer system redundancy across regions.
    • Traditional on-premises redundancy requires doubling hardware which must be maintained for just-in-case situations and sits mostly idle. Microsoft’s hardware infrastructure is fully redundant with the cost spread across all Azure customers to minimize the cost of infrastructure redundancy to UMB. This alleviates concerns related to the availability and disaster recovery of on-campus data centers.

Availability

  • The Microsoft agreement with University of Maryland, Baltimore (UMB) assures high availability, with Azure uptime of almost 100%.

Sustainability

  • Shift UMB power consumption for computing to renewable energy sources.
    • Microsoft is committed to increasing its use of green and renewable energy sources to power its datacenters and has made a $1 billion investment in a climate fund. Using Microsoft Azure will reduce UMB’s computing power consumption and carbon footprint.

Security

  • IT security and data protection are enhanced by leveraging Microsoft’s personnel and sophisticated security tools. Microsoft has over 3,500 security experts who continually monitor sensitive data stored in Azure. Microsoft invests over $1 billion annually in IT security.

Cost

  • The pay-as-you-go model for the cloud infrastructure only requires paying for those services (compute and storage) that are used and consumed over a particular period of time. Running Windows computers in Azure also costs less because of the master agreement that UMB has with Microsoft, and the pay-as-you-go subscription model yields further cost savings.

Partnerships

  • Microsoft also has an Innovation/Research focus, having established partnerships with the National Science Foundation and National Institutes of Health to provide computing resources to research organizations, e.g., STRIDES program (Science and Technology Research Infrastructure for Discovery, Experimentation & Sustainability).

Appendix C: What is Azure Virtual Desktop (AVD)? 

AVD is a Microsoft Azure-based system used for accessing the Azure Cloud infrastructure. With an Internet connection, it provides access to applications and data in Azure.  The hardware used for access does not need strong computing capabilities since that work is handled on the virtual end in Azure. 

Appendix D: What are the Benefits of using Microsoft Azure Virtual Desktop? 

  • The Azure Virtual Desktop (AVD) infrastructure is an important element in enhancing the security of data. AVD provides secure access to data stored in highly secured computing environments.
  • AVD provides direct access, after logging in, to the software that you need and to your file/data storage.
  • The presentation of AVD is very similar to logging in remotely to your desktop.
  • AVD accounts can be quickly created.
  • The computing resources within an AVD account can quickly scale to meet the computing needs of the user.
  • There is a reduction in physical server hardware and hardware maintenance costs.
  • There is no longer a need to buy and use costly, high-end computers.
  • AVD supports multiple computing endpoints: Windows, Apple, Chromebook, and Android.
  • There is a persistent user experience, where an individual can get access to applications and data at any time and from anywhere.

 

Appendix E: More About the Secure Research Environment (SRE) Security 

The University of Maryland, Baltimore (UMB) Secure Research Environment (SRE) is a centralized virtual environment designed to protect sensitive and restricted research data. Secure virtual desktop environments and custom compute configurations allow researchers to access sensitive data under a higher level of control and data protection. Data is segregated per research project and only accessible by the research team that is assigned to the enclave.

 

Microsoft Defender for Cloud helps keep your data and applications safe when you're using Microsoft's Azure cloud services. It scans for suspicious activity or potential problems and takes action to prevent or address them, making your cloud environment more secure. It will be enabled for all subscriptions as part of the deployment automation.

 

  • User authentication is configured to the existing UMB Azure Active Directory tenant and Active Directory service.
  • Private network access is isolated from existing UMB networks.
  • All access to the secure enclave resources will be via endpoints in AVD.
  • Monitoring, logging, and reporting will be via an Azure Log Analytics workspace in the SRE.
  • Approved data is brought in and out of project-specific secured enclaves via an Honest Broker/Data Steward.
  • Only de-identified data is allowed to leave the SRE.
  • Access to the public internet is blocked from within the SRE.
  • A NIST 800-171 compliance policy will be applied as a default to research subscriptions; research/funding source requirements may require NIST 800-53 to be applied in certain instances.
  • All Platform as a Service (PaaS) services will be deployed with private endpoints and public access disabled except where required.
  • Azure Cloud Security Posture Management is enabled.
  • Defender for Cloud Workload Protection is enabled where required.