Monday, May 21, 2012

Customizing SSDT: Power Management Optimization on 10.7.4


With the release of SSDTs in MultiBeast, I decided to limit the Overclocked SSDT to a maximum clock of 4.2 GHz since that is the maximum that About This Mac can report using auto detection.

Over the past few days, numerous users have been asking how to customize their SSDT to run at a higher clock rate, and how to see more P-states on a non-overclocked system. This post aims to explain the SSDT and how to customize it.

Let's look at the first block of code:

    External (\_PR_.CPU7, DeviceObj)
    External (\_PR_.CPU6, DeviceObj)
    External (\_PR_.CPU5, DeviceObj)
    External (\_PR_.CPU4, DeviceObj)
    External (\_PR_.CPU3, DeviceObj)
    External (\_PR_.CPU2, DeviceObj)
    External (\_PR_.CPU1, DeviceObj)
    External (\_PR_.CPU0, DeviceObj)

This block declares how many logical CPUs (real cores or hyper-threads) are available. In this example we are declaring up to 8 CPUs. If your CPU has fewer (Core i3 or Core i5), it is safe to delete External (\_PR_.CPU4, DeviceObj) through External (\_PR_.CPU7, DeviceObj). If you do, you must also delete the corresponding entries Scope (\_PR.CPU4) through Scope (\_PR.CPU7) at the end of the SSDT.

Let's look at the next section:

    Scope (\_PR.CPU0)
    {
        Name (APSN, 0x04)
        Name (APSS, Package (0x1B)
        {

Here we are defining the details for CPU0. The first line assigns 0x04 to APSN, which is an Apple-specific object. The next object, APSS, is also Apple-specific, but it is laid out the same way as the ACPI object _PSS. The key thing in this line is Package (0x1B): 0x1B is the number of entries contained within, written in hexadecimal. 0x1B is 27 decimal, so we have 27 P-states defined.

Next let's look at the topmost entry, with comments added based on the ACPI Specification for _PSS:

            Package (0x06)
            {
                0x1068,   // CoreFreq
                Zero,     // Power
                0x0A,     // TransitionLatency
                0x0A,     // BusMasterLatency
                0x2A00,   // Control
                0x2A00    // Status
            },

From the ACPI Specification, Section 8.4.4.2:

CoreFreq. Indicates the core CPU operating frequency (in MHz).
Power. Indicates the performance state’s maximum power dissipation (in milliWatts).
TransitionLatency. Indicates the worst-case latency in microseconds that the CPU is unavailable during a transition from any performance state to this performance state.
BusMasterLatency. Indicates the worst-case latency in microseconds that Bus Masters are prevented from accessing memory during a transition from any performance state to this performance state.
Control. Indicates the value to be written to the Performance Control Register (PERF_CTRL) in order to initiate a transition to the performance state.
Status. Indicates the value that OSPM will compare to a value read from the Performance Status Register (PERF_STATUS) to ensure that the transition to the performance state was successful. OSPM may always place the CPU in the lowest power state, but additional states are only available when indicated by the _PPC method.

So in our example CoreFreq is 0x1068, which is 4200 decimal, or 4.2 GHz. It seems that Apple is ignoring the value for Power, though you can use the Power values from the motherboard's SSDT extract. I've found that TransitionLatency and BusMasterLatency are always 0x0A. Now for the final two values, both 0x2A00. Here we ignore the two trailing zeros and are only concerned with the first two hex digits, 0x2A, which is 42 decimal. This is the value that will be displayed by MSRDumper, so when the system is running at 4.2 GHz MSRDumper will show the P-state as 42.
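As a quick check of the conversions above, here is a small Python sketch that decodes the two values that matter in an entry. The function name `decode_pstate` is made up for this illustration, and it assumes that the number MSRDumper reports is the upper byte of Control (the "ignore the two zeros" rule above).

```python
def decode_pstate(core_freq, control):
    """Return (frequency in MHz, multiplier) for one APSS entry.

    Assumption: the multiplier is the high byte of Control, so the
    two trailing hex zeros are dropped by shifting right 8 bits.
    """
    multiplier = control >> 8
    return core_freq, multiplier


mhz, mult = decode_pstate(0x1068, 0x2A00)
print(f"{mhz} MHz ({mhz / 1000:.1f} GHz), P-state {mult}")
# prints: 4200 MHz (4.2 GHz), P-state 42
```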

OK, now let's look at the next entry.

            Package (0x06)
            {
                0x1004,
                Zero,
                0x0A,
                0x0A,
                0x2900,
                0x2900
            },

So 0x1004 is 4100, or 4.1 GHz, and 0x2900 translates to 41 (0x29). Pretty simple. If you keep following down you will see that CoreFreq decreases by 0x64, or 100, for each state, and the leading two digits of Control and Status decrease by 1 (so the values as written drop by 0x0100). The last entry will have a CoreFreq of 0x0640, and Control and Status will be 0x1000. This corresponds to 1600, or 1.6 GHz, and 16, the minimum clock rate for a desktop Sandy Bridge CPU.
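The pattern is regular enough to generate. The sketch below is only an illustration: `build_apss` and `render_entry` are hypothetical helper names, the 100 MHz step and the minimum multiplier of 16 are taken from the table described above, and Power and the two latencies are hard-coded to the Zero and 0x0A values shown in the entries. It also confirms that a 4.2 GHz to 1.6 GHz table has 27 (0x1B) entries.

```python
def build_apss(max_mult, min_mult=16, bclk_mhz=100):
    """List (CoreFreq, Control) pairs from max_mult down to min_mult."""
    return [(m * bclk_mhz, m << 8) for m in range(max_mult, min_mult - 1, -1)]


def render_entry(core_freq, control):
    """Format one entry the way it appears in the SSDT."""
    return (
        "Package (0x06)\n"
        "{\n"
        f"    0x{core_freq:04X},\n"   # CoreFreq in MHz
        "    Zero,\n"                 # Power (left at Zero, as in the table)
        "    0x0A,\n"                 # TransitionLatency
        "    0x0A,\n"                 # BusMasterLatency
        f"    0x{control:04X},\n"     # Control
        f"    0x{control:04X}\n"      # Status
        "}"
    )


table = build_apss(42)
print(f"Package (0x{len(table):02X})  // {len(table)} states")
print(render_entry(*table[0]))    # the 4.2 GHz entry
print(render_entry(*table[-1]))   # the 1.6 GHz entry
```

Compare the generated text against your own SSDT before pasting anything in; the point is only to make the arithmetic easy to verify.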

Now that that has been explained, let's look at the next Scope.

    Scope (\_PR.CPU1)
    {
        Method (APSS, 0, NotSerialized)
        {
            Return (\_PR.CPU0.APSS)
        }
    }

All this shows is that instead of repeating what we did for CPU0, we can have the SSDT return what was defined for CPU0. Alternatively, you can define individual P-states for each processor, but that is a very time-consuming process. The way I've described is much easier.
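Since every CPU after CPU0 gets the same stub, the blocks can be produced mechanically. This is a minimal sketch, assuming the Scope layout shown above; `cpu_scope` is a hypothetical helper name and the count of four logical CPUs is just an example value.

```python
def cpu_scope(index):
    """Return the Scope block that points CPU<index> at CPU0's APSS."""
    return (
        f"Scope (\\_PR.CPU{index})\n"
        "{\n"
        "    Method (APSS, 0, NotSerialized)\n"
        "    {\n"
        "        Return (\\_PR.CPU0.APSS)\n"
        "    }\n"
        "}"
    )


logical_cpus = 4  # example value: a CPU with four logical CPUs
for i in range(1, logical_cpus):
    print(cpu_scope(i))   # emits the stubs for CPU1, CPU2 and CPU3
```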

Finally, we can discuss what needs to change when modifying the SSDT. If you remove P-states, you must decrease the value in Name (APSS, Package (0x1B) by the number of states deleted. The MultiBeast i7 SSDT has a maximum clock of 3.9 GHz. This means the top of the table will look like this:

        Name (APSS, Package (0x18)
        {
            Package (0x06)
            {
                0x0F3C,
                Zero,
                0x0A,
                0x0A,
                0x2700,
                0x2700
            },

So the Package count went down by 3, from Name (APSS, Package (0x1B) to Name (APSS, Package (0x18), and the first package shows a CoreFreq of 0x0F3C, which is 3900 or 3.9 GHz, with Control and Status now 0x2700, or 39.

Likewise, if you want to add states, remember that each new state adds 0x64 to CoreFreq and 1 to the leading two digits of Control and Status (0x0100 to the values as written), and make sure that the value in Name (APSS, Package (0x18) corresponds to the total number of P-states defined.
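To make the bookkeeping concrete, here is a rough Python sketch that works out the new Package count and the entries to insert at the top of the table. The helper names `package_count` and `entries_to_add` are invented for this example, the 4.5 GHz target is purely hypothetical, and the minimum multiplier of 16 and the 100 MHz step are assumed from the table discussed above.

```python
MIN_MULT = 16  # lowest multiplier in the tables discussed in this post


def package_count(max_mult, min_mult=MIN_MULT):
    """Number of P-states from max_mult down to min_mult, inclusive."""
    return max_mult - min_mult + 1


def entries_to_add(old_max, new_max, bclk_mhz=100):
    """(CoreFreq, Control) pairs to insert above the old top entry."""
    return [(m * bclk_mhz, m << 8) for m in range(new_max, old_max, -1)]


# Hypothetical example: raise the 4.2 GHz table to 4.5 GHz.
for freq, ctrl in entries_to_add(42, 45):
    print(f"CoreFreq 0x{freq:04X}, Control/Status 0x{ctrl:04X}")
print(f"Name (APSS, Package (0x{package_count(45):02X})")
```

The same count function reproduces the two tables in this post: a maximum of 42 gives 0x1B and a maximum of 39 gives 0x18.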

Something everyone should be aware of is that there is a limit to the number of entries that AppleIntelCPUPowerManagement can handle. If you define more P-states than you can use, such as by using the Overclock SSDT on a non-overclocked system, you will lose P-states at the bottom of the table. When I used the MultiBeast Overclock SSDT, which has a maximum clock of 4.2 GHz, on a system that can reach a maximum of 3.8 GHz, I saw the following from MSRDumper:

MSRDumper PStatesReached: 16 23 24 25 26 27 38

Compare that with the MultiBeast i7 SSDT, where you will see:

MSRDumper PStatesReached: 16 17 18 19 20 21 35 36 37 

You may ask, what's the big deal? Well, it relates to power consumption. The lower the P-state (CPU frequency), the lower the power consumption. So by using the Overclocked SSDT, which loses several of the lower P-states, my power consumption will be higher, which will cost me money in the long run.

I hope you found this information helpful!

-MacMan

For discussions on this and other topics, register today at tonymacx86.com!