Known Issues

Updates on known issues affecting the ePortfolio

Unscheduled downtime event – Thursday 20th July 2017

A failure with the networking infrastructure at our hosting provider (specifically, the Microsoft Azure UKSouth region) resulted in the NHS ePortfolios website at http://www.nhseportfolios.org becoming unavailable/unresponsive to all users on three occasions on Thursday the 20th July 2017:

  • From 21:49 to 21:55 (6 minutes)
  • From 22:05 to 22:10 (5 minutes)
  • From 22:29 to 00:58 on Friday the 21st July (149 minutes)
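The durations listed above can be checked with a short calculation. The following is an illustrative sketch only, not part of the incident record: it assumes the times are wall-clock times on a single clock, and that an end time earlier than the start time means the outage ran past midnight. The function name outage_minutes is hypothetical.

```python
from datetime import datetime, timedelta


def outage_minutes(start: str, end: str) -> int:
    """Return the minutes between two HH:MM times.

    Assumption: if the end time is earlier than the start time,
    the outage crossed midnight into the following day.
    """
    fmt = "%H:%M"
    t0 = datetime.strptime(start, fmt)
    t1 = datetime.strptime(end, fmt)
    if t1 < t0:
        t1 += timedelta(days=1)  # roll over to the next day
    return int((t1 - t0).total_seconds() // 60)


# The three outage windows reported for 20th July 2017.
outages = [("21:49", "21:55"), ("22:05", "22:10"), ("22:29", "00:58")]
durations = [outage_minutes(start, end) for start, end in outages]

print(durations)       # [6, 5, 149]
print(sum(durations))  # 160
```

Under those assumptions, the three windows add up to 160 minutes of unavailability in total.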

Following identification of the cause of the outage, updates were provided via the @NHSePortfolios Twitter account between 23:13 and 01:05.

No data was damaged or compromised as a result of this incident.

Update, 25 July 2017: Root Cause Analysis provided by hosting provider:

RCA – Network Infrastructure – UK South

Summary of impact: Between July 20, 2017 21:41 UTC and July 21, 2017 1:40 UTC, a subset of customers may have encountered connectivity failures for their resources deployed in the UK South region. Customers would have experienced errors or timeouts while accessing their resources. Upon investigation, the Azure Load Balancing team found that the data plane for one of the instances of Azure Load Balancing service in UK South region was down. A single instance of Azure Load Balancing service has multiple instances of data plane. It was noticed that all data plane instances went down in quick succession and failed repeatedly whilst trying to self-recover. The team immediately started working on the mitigation to fail over from the offending Azure Load Balancing instance to another instance of Azure Load Balancing service. This failover process was delayed due to the fact that VIP address of Azure authentication service used to secure access to any Azure production service in that region was also being served by the Azure Load Balancing service instance that went down. The Engineering teams resolved the access issue and then recovered the impacted Azure Load Balancing service instance by failing over the impacted customers to another instance of Azure Load Balancing service. The dependent services recovered gradually once the underlying load balancing service instance was recovered. Full recovery by all of the affected services was confirmed by 01:40 UTC on 21 July 2017.

Workaround: Customers who had deployed their services across multiple regions could fail out of UK South region.

Root cause and mitigation: The issue occurred when one of the instances of Azure Load Balancing service went down in the UK South region. The root cause of the issue was a bug in the Azure Load Balancing service. The issue was exposed due to a specific combination of configurations on this load balancing instance combined with a deployment specification that caused the data plane of the load balancing service to crash. There are multiple instances of data plane in a particular instance of Azure Load Balancing Service. However, due to this bug, the crash cascaded through multiple instances. The issue was recovered by failing over from the specific load balancing instance to another load balancing instance. The software bug was not detected in deployments in prior regions because it only manifested under specific combinations of the configuration in Azure Load Balancing services. The combination of configurations that exposed this bug was addressed by recovering the Azure Load Balancing service instance.

Next steps: We sincerely apologize for the impact to affected customers. We are continuously taking steps to improve the Microsoft Azure Platform and our processes to help ensure such incidents do not occur in the future. In this case, we will:

  1. Roll out a fix to the bug which caused Azure Load Balancing instance data plane to crash. In the interim a temporary mitigation has been applied to prevent this bug from resurfacing in any other region.
  2. Improve test coverage for the specific combination of configuration that exposed the bug.
  3. Address operational issues for Azure Authentication services break-glass scenarios.

Unscheduled Downtime Event – 12th May 2017 & 13th May 2017

08:27 on Saturday the 13th May 2017

Access to the NHS ePortfolios platform at https://www.nhseportfolios.org has now been restored.

08:25 on Saturday the 13th May 2017

We have now received notification that access to the NHS ePortfolios platform can be restored and are working to action this.

06:55 on Saturday the 13th May 2017

The NHS ePortfolios platform remains offline as part of co-ordinated containment efforts.

Further information will be posted here and via the @NHSePortfolios Twitter account when we have an indication as to when the platform will be re-enabled.

22:55 on Friday the 12th May 2017

The NHS ePortfolios platform still remains offline as part of co-ordinated containment efforts.

Further information will be posted here and via the @NHSePortfolios Twitter account when we have an indication as to when the platform will be re-enabled.

20:55 on Friday the 12th May 2017

The NHS ePortfolios platform remains offline as part of co-ordinated containment efforts following the rapid spread of malware within associated but disconnected networks.

Further information will be posted here and via the @NHSePortfolios Twitter account when we have an indication as to when the platform will be re-enabled.

18:00 on Friday the 12th May 2017

The NHS ePortfolios team within NHS Education for Scotland have complied with an official request and temporarily prevented access to the NHS ePortfolios platform at https://www.nhseportfolios.org. This is a precautionary measure, taken to ensure the safety of users’ data.

Further information will be posted here and via the @NHSePortfolios Twitter account when we have an indication as to when the platform will be re-enabled.

“No OpenID endpoint found” error creating new link from NHS ePortfolios to e-LfH accounts

UPDATE: 9th November 2016, 8pm.  We have now received notification from e-LfH that the OpenID endpoint has been restored and as such users will once again be able to establish links between their NHS ePortfolios user account and e-LfH user account.

The NHS ePortfolios team are aware that users will presently encounter issues when attempting to establish a new link between an NHS ePortfolios account and an e-LfH user account to enable the exchange of Learning Activity data.

When attempting to establish the links, users are presently seeing an error message within NHS ePortfolios stating “No OpenID endpoint found”, as shown below:

NHS ePortfolios - No OpenID Endpoint found error.

This issue has arisen following an application upgrade at e-LfH at the end of last week and the e-LfH team are working to resolve this as quickly as they can:

e-LfH OpenID error

If you have already established a link between your NHS ePortfolios account and your e-LfH user account, the ongoing exchange of Learning Activity data will not be affected by this issue.

ePortfolio Messaging Incident – 4th November 2015

The NHS ePortfolios system contains a facility called “messaging”. This facility allows users of NHS ePortfolios to send messages to other users without leaving the NHS ePortfolios system. In addition to allowing trainees and their supervisors to exchange messages, the facility also allows programme directors and administrators to communicate with groups of users to which they have been assigned permissions. No messages are filtered or censored based upon content.

The NHS ePortfolios messaging facility is not an email replacement or email relay service. Users’ email addresses are not required for the use of, and are not disclosed by, the messaging facility. If a user receives a message within this facility whilst not logged in, the NHS ePortfolios system will send a one-time notification to the user’s email address informing them that they have an unread message they may wish to check when they next log in.

On the 4th November 2015 at 10:18am, an NHS ePortfolios user with the “Physician Administrator” role chose to use the NHS ePortfolios messaging system to send a message with two attachments to all users with the “Physician Trainee” role within the location to which they had been assigned permission (approx. 550 recipients).

The ability of the user to send a message to this audience is by design – no security system was subverted to allow this message to be sent and the message was not sent/delivered to an audience to whom the user did not have appropriate permissions.

Upon receipt by users, the “sender name” was correctly displayed as that of the user that sent the message – the sender name was not obfuscated in any manner.

Unscheduled downtime – Friday 8th May 2015

On Friday the 8th May 2015, a failure of equipment at our hosting provider resulted in the NHS ePortfolios website at http://www.nhseportfolios.org becoming unavailable/unresponsive to some users for three minutes from 2:07pm and again for nine minutes from 2:19pm, ending at 2:28pm.

During this time, users would have encountered disruption / errors as our systems transferred users away from failing devices to other, still functional, devices.

For four minutes from 2:24pm, the NHS ePortfolios website was online, but user requests were being serviced by only one of the three devices that are normally responsible for this task. At the time, around 600 users were making approximately 2,500 requests per minute, so response times from a single device would have resulted in an unacceptable user experience.
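To give a rough sense of the extra load on the single remaining device, the figures quoted above can be worked through as follows. This is a back-of-the-envelope sketch only: it assumes requests are normally spread evenly across the three devices, which the report does not state, and the function name load_per_device is hypothetical.

```python
# Figures quoted in the incident report.
REQUESTS_PER_MINUTE = 2500  # approximate total request rate
ACTIVE_USERS = 600          # approximate number of active users
NORMAL_DEVICES = 3          # devices normally servicing requests


def load_per_device(total_requests: float, devices: int) -> float:
    """Requests per minute per device, assuming an even split (an assumption)."""
    if devices < 1:
        raise ValueError("at least one device is required")
    return total_requests / devices


normal = load_per_device(REQUESTS_PER_MINUTE, NORMAL_DEVICES)
degraded = load_per_device(REQUESTS_PER_MINUTE, 1)

print(f"Normal load per device:   {normal:.0f} requests/minute")
print(f"Degraded load per device: {degraded:.0f} requests/minute")
print(f"Increase on the remaining device: {degraded / normal:.1f}x")
print(f"Average per user: {REQUESTS_PER_MINUTE / ACTIVE_USERS:.2f} requests/minute")
```

Under the even-split assumption, the remaining device would have been handling roughly three times its usual share of requests.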

Intermittent outages affecting a single device continued to be experienced until 3:23pm.

Shortly after midnight, a configuration change was performed by our hosting provider, and the disruption has not recurred since.

Need help?
If you are having a problem with your NHS ePortfolio account, please contact the Support team via the Help section of your account, or email us at support@nhseportfolios.org