Investigating & Preventing Kernel Panics in the 2019 Mac Pro

Mac Pro Rack 2019 - top

Nobody likes to experience a kernel panic on their Mac. This is when your Mac restarts itself while you are working. After it finishes the restart process, you see a dialog telling you that your Mac shut down because of a problem. Worse yet, you may have lost all of your recent work as files are not saved during this type of restart.

We Start Hearing From Our Customers

A kernel panic is even worse when it happens on your brand-new 2019 Mac Pro, a computer that may have cost $10,000 or more. So when we started to hear that a few customers were experiencing kernel panics on their 2019 Mac Pros, we wanted to figure out quickly what was happening. These panics occurred for only a very small percentage of customers who had installed our OWC Accelsior 4M2 cards. The log from macOS said that the kernel panic was caused by an error on the PCIe bus, the wires on the circuit board that connect cards to the Intel processor. The panic most often affected customers with two or more OWC Accelsior 4M2 cards installed.
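If you want to check whether your own panic reports mention the PCIe bus, a short script can do the searching for you. The sketch below is only an illustration: the function name find_reports is made up for this example, the ".panic" file extension and the /Library/Logs/DiagnosticReports folder are assumptions about where macOS keeps these reports, and the search keyword "PCI" is a guess, since the exact wording of the log is not quoted here.

```python
from pathlib import Path


def find_reports(report_dir: str, keyword: str = "PCI") -> list[str]:
    """Return names of *.panic files in report_dir whose text contains keyword.

    The keyword default is an assumption; the article does not quote the
    exact text that macOS writes to the panic log.
    """
    matches = []
    for path in sorted(Path(report_dir).glob("*.panic")):
        # Ignore undecodable bytes so one odd file does not stop the scan.
        text = path.read_text(errors="ignore")
        if keyword.lower() in text.lower():
            matches.append(path.name)
    return matches


if __name__ == "__main__":
    # Assumed location of system panic reports on macOS; adjust if yours differs.
    for name in find_reports("/Library/Logs/DiagnosticReports"):
        print(name)
```

A match only tells you that the keyword appears in a report. It does not confirm that your panic is the one described in this article.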

We Investigate The Problem

Through many hours of dedicated work, our teams figured out what hardware configurations are required for a kernel panic to occur. We worked literally around the clock investigating this problem: when our software development team in California finished testing for the day, our hardware design team in Taipei, on the other side of the world, would pick up where the California team left off.

We determined that the kernel panic occurs when the 2019 Mac Pro sleeps. More importantly, we figured out that the problem only occurs when the OWC Accelsior 4M2 is installed in slot 4 or 5 and only if that slot is configured to use pool B of the PCIe lanes (see below). If the OWC Accelsior 4M2 is in any other slot, or if it is configured to use pool A, no kernel panics occur.
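The condition above is a simple rule: slot 4 or 5, and pool B. As a minimal sketch, it can be written as a small check. The names used here (is_at_risk, at_risk_slots, and the layout dictionary) are made up for illustration and are not part of any OWC or Apple tool; you would fill in the slot and pool values by reading them from Expansion Slot Utility yourself.

```python
# Sketch of the rule described above; names are illustrative only.
AFFECTED_SLOTS = {4, 5}   # slots where the panic was observed
AFFECTED_POOL = "B"       # PCIe lane pool where the panic was observed


def is_at_risk(slot: int, pool: str) -> bool:
    """Return True if a card in this slot and pool matches the reported condition."""
    return slot in AFFECTED_SLOTS and pool.upper() == AFFECTED_POOL


def at_risk_slots(assignments: dict[int, str]) -> list[int]:
    """Given {slot: pool} for each Accelsior 4M2, list the slots that match."""
    return sorted(slot for slot, pool in assignments.items() if is_at_risk(slot, pool))


if __name__ == "__main__":
    # Hypothetical layout: cards in slots 3, 4 and 5.
    layout = {3: "A", 4: "B", 5: "A"}
    print(at_risk_slots(layout))  # prints [4]
```

In this hypothetical layout only slot 4 is flagged, because slot 5 is already on pool A and slot 3 is not one of the affected slots.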

The teams also found other cards that, when installed, result in a kernel panic. One example is the HighPoint SSD7101A.

Getting Help From Our Chip Supplier

We then contacted our PCIe chip supplier in Taiwan to get help determining why macOS thought there was an error on the PCIe bus. Their engineers analyzed the signals on the PCIe bus and told us that macOS was mistakenly indicating that there is an error when, in fact, all the signals were correct.

Apple Starts Investigating

We have since reported this problem to Apple and are working with them to develop a solution. As with the problem we discovered recently, where a Mac hangs while transferring large files, I am confident that they will develop a quick and reliable fix.

How to Prevent the Kernel Panic From Occurring

Since the kernel panic occurs only when the OWC Accelsior 4M2 is in slots 4 or 5, and the slot is using pool B of the PCIe lanes, the solution for your 2019 Mac Pro is easy. You can either move the card to a different slot or change the slot to use pool A of the PCIe lanes.

Move the OWC Accelsior 4M2 Card to a Slot Other Than Slot 4 or 5

You can move your OWC Accelsior 4M2 card to slots 1, 2, 3, 6, or 7, and this will prevent the kernel panic from occurring. Slot 8 is always occupied by the Apple Thunderbolt card.

Inside Mac Pro 2019 showing PCIe slots
Note: Slot 2 is normally covered by a graphics card. If you install your OWC Accelsior 4M2 in slot 2, you will only be able to use one graphics card, installed in slot 3.
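If you have several cards installed, it can help to list which of the unaffected slots are still free. The sketch below assumes you supply the set of occupied slots yourself; the name candidate_slots is hypothetical, and the slot numbers are the ones listed above.

```python
# Slots listed above as unaffected; slot 8 holds the Apple Thunderbolt card.
UNAFFECTED_SLOTS = (1, 2, 3, 6, 7)


def candidate_slots(occupied: set[int]) -> list[int]:
    """Return the unaffected slots that are not already occupied."""
    return [slot for slot in UNAFFECTED_SLOTS if slot not in occupied]


if __name__ == "__main__":
    # Hypothetical machine: a graphics card covering slots 1 and 2,
    # an Accelsior 4M2 in slot 4, and the Apple card in slot 8.
    print(candidate_slots({1, 2, 4, 8}))  # prints [3, 6, 7]
```

Treat slot 2 as occupied if a graphics card covers it, as in the note above.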

Change the Slot With the OWC Accelsior 4M2 Card to Use PCIe Lanes in Pool A

To change the slot which contains the OWC Accelsior 4M2 to use the lanes in pool A, follow these simple steps:

1) Select About This Mac from the Apple menu.

2) Click on the PCI Cards tab in the About This Mac window.

3) Click on the Expansion Slot Utility… button.

4) Click on the Automatic Bandwidth Configuration checkbox to deselect it.

5) Click on the button in the A column next to the slot containing your OWC Accelsior 4M2. (In this example, it is slot 4.)

6) When you are done, the Expansion Slot Utility window should look like this.

7) Click the close box in the top left of the Expansion Slot Utility window. You will be prompted to save your changes, and then your 2019 Mac Pro will restart.



Comments


  • I’m running a Mac pro 2019 with 3 Accelsior 4M2 cards in slots 3, 4, 5.
    Slot 1-2 AMD 5700x, slot 6 avid HD Native, slot 7 Sonnet USB and slot 8 Apple I/O. OS Catalina 10.15.3

    Despite running the NVMe cards in the B pool, I still experience sleep/wake kernel panics. There is often another, similar error that is logged as ‘PowerOff timed out in phase “Notifying power plane drivers”’.

    The errors always occur at boot.

    Sometimes, after being shut down, not put in sleep mode, the MP will reboot itself as if it had actually been sleeping rather than shut down, a sort of involuntary resurrection.

    This behavior is inconsistent, creating the impression that some juggling of parameters may have solved the problem, but no.

    At one point, allocating the NVMe cards to the B pool seemed to be a cure but that only worked when shutting down while leaving the MP connected to power. Once the master power was shut off then turned on the next day, the boot error reappeared.

    Is there any news of progress on Apple’s side regarding a fix? I would move to the latest version of Catalina, but it doesn’t yet support my principal working app. If that provided a fix, I assume word would have been out by now. This is a depressing way to start off with such an ostensibly fine machine.




    • We have confirmed that this kernel panic still occurs after updating to macOS 10.15.5.

      Your Accelsior 4M2 cards in slots 4 and 5 should be in pool A, not in pool B as you have them currently configured. This has worked for the sleep kernel panic and may fix your problem.

      I will ask our tester at our headquarters in Woodstock to try to repeat your problems with booting, as we only have 2 Accelsior 4M2 cards here in California.

      I don’t have any news from Apple, nor do I expect any until after they ship a fix for this.




  • For what it’s worth, I have this kernel panic occur about 50 percent of the time when my Mac Mini wakes from sleep. I have a 3TB Mercury Elite Pro attached for nightly backups of the Mini. The kernel panics began happening about the same time I installed Mojave and the Mercury Elite Pro. I’ve tried in vain to find a solution. I always send the crash report log to Apple but am not qualified to read or understand it myself, so I don’t know where or why the kernel panic occurs. Could it be related in any way to the issue discussed in this article?




    • No, this issue only happens on the 2019 Mac Pro.

      I know how frustrating it can be to track down a kernel panic. You might want to contact the OWC support center and get some help from them.




    • I realized that I am experiencing a kernel panic on my Mac mini as well, every time it goes to sleep (2018 Mac mini). I have 2 startup volumes on this Mac, one for Mojave and the other for Catalina. The kernel panic only occurs when I am starting up from Mojave, so you might want to try creating a Catalina volume and trying that for a few days.

      When I dig into the kernel panic, I see that the panic is caused by Apple’s T2 chip (the operating system which is panicking is running on an ARM chip).

      I also have no external storage devices attached to the Mac mini, so I don’t think your problems are related to the Mercury Elite Pro.




  • I was wondering when that PLX switch would start causing trouble for MP owners. Sure didn’t take very long. Of course, if they had offered a dual CPU version then available lanes wouldn’t have been an issue. IMO Apple is trying to do too much with too little.

    Your engineers noted that the signals at the PCIe pins were correct. Did they also test the signal for deviations after it had crossed the PLX? I’d lay a ten spot that the hyper fast I/O nature of the equipment in question here was sensitive enough to the (mis)timing caused by the PLX switching and it was just enough to KP the machine.

    PC motherboard manufacturers tried using PLX switches to give users more lanes on their high end mainstream boards a few years ago before Intel had brought out the X series HEDT CPUs and motherboards. It didn’t work very well. There were timing issues galore. And that was before we had NVMe and Optane in the mix.

    I’d like to say this is the last we’ll see of the problem seen here, but I doubt it. Unless Apple reduces the I/O rate below the threshold that triggers the mismatched timings, it’s likely other high speed I/O cards will also suffer when using Pool B in certain slots. :(




  • Apple has really slipped up lately. They seem to be spending a whole lot of time and money on Memoji and emojis when they should be spending it on actually making their expensive systems actually work. I’ve now been bitten by both this bug and the large file copy bug — and Apple said both were most certainly not their fault.

    HAH!




  • That’s really good debugging work!

    I congratulate your software group and your hardware design team for all the work you put into this. Far too many vendors would’ve just thrown it over the fence at Apple and just said “something is wrong.”

    Sleep seems to be a problematic area; since I moved to macOS Mojave, I sometimes have issues waking up my MacPro 5,1 as the OS sometimes has problems waking up the OS drive, a PCIe Drive Kit with OWC Mercury Electra 6G SSD.

    Thankfully that hang is rare enough that it’s only a minor nuisance.




    • Thank you for the compliment. I will pass it along to the members of both teams who put in the long hours.




      • This is fantastic troubleshooting work. I’ve suffered through this issue and at least 20 hours of troubleshooting through AppleCare. They even replaced my Mac’s logic board, but that didn’t fix the sleep issue with two 4x4s in slots 4 and 5.

        I wish I could just move both to Pool A, but my boards are both x16 and there doesn’t seem to be enough bandwidth to get the maximum speed from my drives.

        I hope Apple follows your cue and issues a software update. I also hope Apple rewards you for catching this, because it has undoubtedly cost them a lot of money in needless repairs.




      • Terrific debugging! Thanks.
        I have two new MacPros each with two Accelsior 4M2 Cards.
        For one reason or another, I have those boxes set to never sleep.

        I have not seen this problem but I have been on lock-down most of the time I’ve had the boxes deployed.

        Am I safe?