summaryrefslogtreecommitdiff
path: root/drivers/cpufreq/powernv-cpufreq.c
AgeCommit message (Collapse)Author
2015-09-01cpufreq: powernv: Increase the verbosity of OCC console messagesShilpasri G Bhat
Modify the OCC reset/load/active event message to make it clearer for the user to understand the event and effect of the event. Suggested-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-07-28cpufreq: powernv: Restore cpu frequency to policy->cur on unthrottlingShilpasri G Bhat
If frequency is throttled due to OCC reset then cpus will be in Psafe frequency, so restore the frequency on all cpus to policy->cur when OCCs are active again. And if frequency is throttled due to Pmax capping then restore the frequency of all the cpus in the chip on unthrottling. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-07-28cpufreq: powernv: Report Psafe only if PMSR.psafe_mode_active bit is setShilpasri G Bhat
On a reset cycle of OCC, although the system retires from safe frequency state the local pstate is not restored to Pmin or last requested pstate. Now if the cpufreq governor initiates a pstate change, the local pstate will be in Psafe and we will be reporting a false positive when we are not throttled. So in powernv_cpufreq_throttle_check() remove the condition which checks if local pstate is less than Pmin while checking for Psafe frequency. If the cpus are forced to Psafe then PMSR.psafe_mode_active bit will be set. So, when OCCs become active this bit will be cleared. Let us just rely on this bit for reporting throttling. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-07-28cpufreq: powernv: Call throttle_check() on receiving OCC_THROTTLEShilpasri G Bhat
Re-evaluate the chip's throttled state on recieving OCC_THROTTLE notification by executing *throttle_check() on any one of the cpu on the chip. This is a sanity check to verify if we were indeed throttled/unthrottled after receiving OCC_THROTTLE notification. We cannot call *throttle_check() directly from the notification handler because we could be handling chip1's notification in chip2. So initiate an smp_call to execute *throttle_check(). We are irq-disabled in the notification handler, so use a worker thread to smp_call throttle_check() on any of the cpu in the chipmask. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-07-28cpufreq: powernv: Register for OCC related opal_message notificationShilpasri G Bhat
OCC is an On-Chip-Controller which takes care of power and thermal safety of the chip. During runtime due to power failure or overtemperature the OCC may throttle the frequencies of the CPUs to remain within the power budget. We want the cpufreq driver to be aware of such situations to be able to report the reason to the user. We register to opal_message_notifier to receive OCC messages from opal. powernv_cpufreq_throttle_check() reports any frequency throttling and this patch will report the reason or event that caused throttling. We can be throttled if OCC is reset or OCC limits Pmax due to power or thermal reasons. We are also notified of unthrottling after an OCC reset or if OCC restores Pmax on the chip. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-07-28cpufreq: powernv: Handle throttling due to Pmax capping at chip levelShilpasri G Bhat
The On-Chip-Controller(OCC) can throttle cpu frequency by reducing the max allowed frequency for that chip if the chip exceeds its power or temperature limits. As Pmax capping is a chip level condition report this throttling behavior at chip level and also do not set the global 'throttled' on Pmax capping instead set the per-chip throttled variable. Report unthrottling if Pmax is restored after throttling. This patch adds a structure to store chip id and throttled state of the chip. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-04-01cpufreq: powernv: Report cpu frequency throttlingShilpasri G Bhat
The power and thermal safety of the system is taken care by an On-Chip-Controller (OCC) which is real-time subsystem embedded within the POWER8 processor. OCC continuously monitors the memory and core temperature, the total system power, state of power supply and fan. The cpu frequency can be throttled by OCC for the following reasons: 1)If a processor crosses its power and temperature limit then OCC will lower its Pmax to reduce the frequency and voltage. 2)If OCC crashes then the system is forced to Psafe frequency. 3)If OCC fails to recover then the kernel is not allowed to do any further frequency changes and the chip will remain in Psafe. The user can see a drop in performance when frequency is throttled and is unaware of throttling. So detect and report such a condition, so the user can check the OCC status to reboot the system or check for power supply or fan failures. The current status of the core is read from Power Management Status Register(PMSR) to check if any of the throttling condition is occurred and the appropriate throttling message is reported. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-09-29cpufreq: powernv: Set the cpus to nominal frequency during reboot/kexecShilpasri G Bhat
This patch ensures the cpus to kexec/reboot at nominal frequency. Nominal frequency is the highest cpu frequency on PowerPC at which the cores can run without getting throttled. If the host kernel had set the cpus to a low pstate and then it kexecs/reboots to a cpufreq disabled kernel it would cause the target kernel to perform poorly. It will also increase the boot up time of the target kernel. So set the cpus to high pstate, in this case to nominal frequency before rebooting to avoid such scenarios. The reboot notifier will set the cpus to nominal frequncy. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-09-29cpufreq: powernv: Set the pstate of the last hotplugged out cpu in ↵Preeti U Murthy
policy->cpus to minimum Its possible today that the pstate of a core is held at a high even after the entire core is hotplugged out if a load had just run on the hotplugged cpu. This is fair, since it is assumed that the pstate does not matter to a cpu in a deep idle state, which is the expected state of a hotplugged core on powerpc. However on powerpc, the pstate at a socket level is held at the maximum of the pstates of each core. Even if the pstates of the active cores on that socket is low, the socket pstate is held high due to the pstate of the hotplugged core in the above mentioned scenario. This can cost significant amount of power loss for no good. Besides, since it is a non active core, nothing can be done from the kernel's end to set the frequency of the core right. Hence make use of the stop_cpu callback to explicitly set the pstate of the core to a minimum when the last cpu of the core gets hotplugged out. Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-08-05powerpc/cpufreq: Add pr_warn() on OPAL firmware failuresVaidyanathan Srinivasan
Cpufreq depends on platform firmware to implement PStates. In case of platform firmware failure, cpufreq should not panic host kernel with BUG_ON(). Less severe pr_warn() will suffice. Add firmware_has_feature(FW_FEATURE_OPALv3) check to skip probing for device-tree on non-powernv platforms. Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Acked-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> CC: <stable@vger.kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-05-16cpufreq: powernv: make local function staticBrian Norris
powernv_cpufreq_get() is only referenced in this file. Signed-off-by: Brian Norris <computersforpeace@gmail.com> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> on V2. Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-04-21cpufreq, powernv: Fix build failure on UPSrivatsa S. Bhat
Paul Gortmaker reported the following build failure of the powernv cpufreq driver on UP configs: drivers/cpufreq/powernv-cpufreq.c:241:2: error: implicit declaration of function 'cpu_sibling_mask' [-Werror=implicit-function-declaration] cc1: some warnings being treated as errors make[3]: *** [drivers/cpufreq/powernv-cpufreq.o] Error 1 make[2]: *** [drivers/cpufreq] Error 2 make[1]: *** [drivers] Error 2 make: *** [sub-make] Error 2 The trouble here is that cpu_sibling_mask is defined only in <asm/smp.h>, and <linux/smp.h> includes <asm/smp.h> only in SMP builds. So fix this build failure by explicitly including <asm/smp.h> in the driver, so that we get the definition of cpu_sibling_mask even in UP configurations. Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-04-07cpufreq: powernv: Use cpufreq_frequency_table.driver_data to store pstate idsGautham R. Shenoy
The .driver_data field in the cpufreq_frequency_table was supposed to be private to the drivers. However at some later point, it was being used to indicate if the particular frequency in the table is the BOOST_FREQUENCY. After patches [1] and [2], the .driver_data is once again private to the driver. Thus we can safely use cpufreq_frequency_table.driver_data to store pstate_ids instead of having to maintain a separate array powernv_pstate_ids[] for this purpose. [1]: Subject: cpufreq: don't print value of .driver_data from core From : Viresh Kumar <viresh.kumar@ linaro.org> url : http://marc.info/?l=linux-pm&m=139601421504709&w=2 [2]: Subject: cpufreq: create another field .flags in cpufreq_frequency_table From : Viresh Kumar <viresh.kumar@linaro.org> url : http://marc.info/?l=linux-pm&m=139601416804702&w=2 Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-04-07cpufreq: powernv: cpufreq driver for powernv platformVaidyanathan Srinivasan
Backend driver to dynamically set voltage and frequency on IBM POWER non-virtualized platforms. Power management SPRs are used to set the required PState. This driver works in conjunction with cpufreq governors like 'ondemand' to provide a demand based frequency and voltage setting on IBM POWER non-virtualized platforms. PState table is obtained from OPAL v3 firmware through device tree. powernv_cpufreq back-end driver would parse the relevant device-tree nodes and initialise the cpufreq subsystem on powernv platform. The code was originally written by svaidy@linux.vnet.ibm.com. Over time it was modified to accomodate bug-fixes as well as updates to the the cpu-freq core. Relevant portions of the change logs corresponding to those modifications are noted below: * The policy->cpus needs to be populated in a hotplug-invariant manner instead of using cpu_sibling_mask() which varies with cpu-hotplug. This is because the cpufreq core code copies this content into policy->related_cpus mask which should not vary on cpu-hotplug. [Authored by srivatsa.bhat@linux.vnet.ibm.com] * Create a helper routine that can return the cpu-frequency for the corresponding pstate_id. Also, cache the values of the pstate_max, pstate_min and pstate_nominal and nr_pstates in a static structure so that they can be reused in the future to perform any validations. [Authored by ego@linux.vnet.ibm.com] * Create a driver attribute named cpuinfo_nominal_freq which creates a sysfs read-only file named cpuinfo_nominal_freq. Export the frequency corresponding to the nominal_pstate through this interface. Nominal frequency is the highest non-turbo frequency for the platform. This is generally used for setting governor policies from user space for optimal energy efficiency. [Authored by ego@linux.vnet.ibm.com] * Implement a powernv_cpufreq_get(unsigned int cpu) method which will return the current operating frequency. Export this via the sysfs interface cpuinfo_cur_freq by setting powernv_cpufreq_driver.get to powernv_cpufreq_get(). [Authored by ego@linux.vnet.ibm.com] [Change log updated by ego@linux.vnet.ibm.com] Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>