path: root/include
Age | Commit message | Author
2013-05-24 | fmd: fmd21.1 integration | Mandy Lavi
Signed-off-by: Mandy Lavi <mandy.lavi@freescale.com> - PFC: adjustments for PFC configuration constraints and limitations related to port prefetch mode - workaround for ucode issues Fix the following HW errata regarding discard/error frames on V3: FM_OP_NO_VSP_NO_RELEASE_ERRATA_FMAN_A006675 - Description: OP without VSP will cause buffer leaks when instructed to discard a frame. Workaround: FW will release the buffers. FM_ERROR_VSP_NO_MATCH_SW006 - Description: any port with VSP enabled and multiple VSPs configured can end up with an error frame being enqueued to the error queue with a VSP other than the default VSP. Workaround: FW will replace the current VSP with the default VSP just before the frame is enqueued to the error queue. - New chosen-node parameter supported: errors-to-discard Usage: optional Value type: <u32> Definition: specifies which errors should be discarded. Errors that are not in the mask will not be discarded; i.e. those errors will be enqueued and sent to the default error queue. Change-Id: Ib468c67de88376e17d9c39ab5a0c8fc5b33b7b82 Reviewed-on: http://git.am.freescale.net:8181/2605 Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com> Reviewed-by: Bucur Madalin-Cristian-B32716 <madalin.bucur@freescale.com> Reviewed-by: Garg Vakul-B16394 <vakul@freescale.com> Reviewed-by: Radulescu Ruxandra Ioana-B05472 <ruxandra.radulescu@freescale.com> Reviewed-by: Chereji Marian-Cornel-R27762 <marian.chereji@freescale.com> Reviewed-by: Wang Haiying-R54964 <Haiying.Wang@freescale.com> Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
2013-05-24 | gianfar: avoid allocating new skb to rx ring for fwd packets. | Rajan Gupta
Patch enhances the performance for forwarded traffic by re-using the skb in rx ring instead of allocating a new skb. Changes are done under RX_TX_BUFF_XCHG patch which gets enabled for ASF enabled kernel only. Patch also removes per-cpu variables used in RX_TX_BUFF_XCHG patch, replacing with parameters in skbuff structures to enhance the performance. Change-Id: I2b3c1ec80fe3ef21ade9ce881d5cb86695169518 Signed-off-by: Rajan Gupta <rajan.gupta@freescale.com> Reviewed-on: http://git.am.freescale.net:8181/2678 Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com> Reviewed-by: Manoil Claudiu-B08782 <claudiu.manoil@freescale.com> Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
2013-05-20 | Merge tag 'v3.8.13' | Scott Wood
This is the 3.8.13 stable release
2013-05-17 | qman: use math64 instead of direct 64-bit division. | Haiying Wang
In 32-bit builds, the ROUNDING macro was breaking kernel assumptions when fed with 64-bit parameters. This forces it to use the recommended wrappers in <linux/math64.h> instead. Signed-off-by: Geoff Thorpe <Geoff.Thorpe@freescale.com> Change-Id: Id708cdad58593f38683112adf59984ffd3d763f7 Reviewed-on: http://git.am.freescale.net:8181/2580 Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com> Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
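For illustration, a minimal sketch of the kind of change involved, assuming the macro rounds a 64-bit value up to a multiple of another (the real ROUNDING macro is not quoted in this log); div64_u64() is one of the <linux/math64.h> wrappers that avoids a direct 64-bit division on 32-bit builds:

    #include <linux/math64.h>
    #include <linux/types.h>

    /* Round val up to the next multiple of align without a plain u64 '/',
     * which would pull in __udivdi3 on 32-bit kernels. */
    static inline u64 qman_round_up(u64 val, u64 align)
    {
            return div64_u64(val + align - 1, align) * align;
    }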
2013-05-17 | dpa_offload: Added macros for printing error messages | Aurelian Zanoschi
Added log_err and log_warn based on pr_err and pr_warn for printing error and warning messages for DPA Offload components. Signed-off-by: Aurelian Zanoschi <Aurelian.Zanoschi@freescale.com> Change-Id: I50de023590efabbc73022eb67c694fc4f6ff3d86 Reviewed-on: http://git.am.freescale.net:8181/2314 Reviewed-by: Chereji Marian-Cornel-R27762 <marian.chereji@freescale.com> Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com> Tested-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
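As a rough sketch of what such wrappers typically look like (the exact dpa_offload definitions are not quoted here; the prefix and format string are assumptions):

    #include <linux/printk.h>

    /* Error/warning helpers that tag every message with the component,
     * function and line, then defer to the standard pr_* machinery. */
    #define log_err(fmt, ...) \
            pr_err("dpa_offload: %s:%d: " fmt, __func__, __LINE__, ##__VA_ARGS__)
    #define log_warn(fmt, ...) \
            pr_warn("dpa_offload: %s:%d: " fmt, __func__, __LINE__, ##__VA_ARGS__)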
2013-05-15 | Fix USDPAA IRQ handling to work correctly in 32 bit mode | Roy Pledge
Also properly respect the NONBLOCK flag in the read() API. Make sure the IRQ is properly inhibited in the IRQ handler. Create a dependency between the USDPAA and USDPAA_IRQ file pointers. Change-Id: Idc0c33c8448a402d5d127e7e4e22e629dbfa5912 Signed-off-by: Roy Pledge <Roy.Pledge@freescale.com> Reviewed-on: http://git.am.freescale.net:8181/2310 Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com> Tested-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
2013-05-11 | drm/radeon: add new richland pci ids | Alex Deucher
commit 62d1f92e06aef9665d71ca7e986b3047ecf0b3c7 upstream. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11 | drm/radeon: add some new SI PCI ids | Alex Deucher
commit 18932a28419596bc9403770f5d8a108c5433fe59 upstream. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11 | drm/prime: keep a reference from the handle to exported dma-buf (v6) | Dave Airlie
commit 219b47339ced80ca580bb6ce7d1636166984afa7 upstream. Currently we have a problem with this: 1. i915: create gem object 2. i915: export gem object to prime 3. radeon: import gem object 4. close prime fd 5. radeon: unref object 6. i915: unref object i915 has an imported object reference in its file priv, that isn't cleaned up properly until fd close. The reference gets added at step 2, but at step 6 we don't have enough info to clean it up. The solution is to take a reference on the dma-buf when we export it, and drop the reference when the gem handle goes away. So when we export a dma_buf from a gem object, we keep track of it with the handle, we take a reference to the dma_buf. When we close the handle (i.e. userspace is finished with the buffer), we drop the reference to the dma_buf, and it gets collected. This patch isn't meant to fix any other problem or bikesheds, and it doesn't fix any races with other scenarios. v1.1: move export symbol line back up. v2: okay I had to do a bit more, as the first patch showed a leak on one of my tests, that I found using the dma-buf debugfs support, the problem case is exporting a buffer twice with the same handle, we'd add another export handle for it unnecessarily, however we now fail if we try to export the same object with a different gem handle, however I'm not sure if that is a case I want to support, and I've gotten the code to WARN_ON if we hit something like that. v2.1: rebase this patch, write better commit msg. v3: cleanup error handling, track import vs export in linked list, these two patches were separate previously, but seem to work better like this. v4: danvet is correct, this code is no longer useful, since the buffer better exist, so remove it. v5: always take a reference to the dma buf object, import or export. (Imre Deak contributed this originally) v6: square the circle, remove import vs export tracking now that there is no difference Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11 | block: fix max discard sectors limit | James Bottomley
commit 871dd9286e25330c8a581e5dacfa8b1dfe1dd641 upstream. linux-v3.8-rc1 and later support plugging for blkdev_issue_discard via commit 0cfbcafcae8b7364b5fa96c2b26ccde7a3a296a9 (block: add plug for blkdev_issue_discard). For example: 1) DISCARD rq-1 with size 4GB 2) DISCARD rq-2 with size 1GB If these 2 discard requests get merged, the final request size will be 5GB. In this case the request's __data_len field may overflow, as it can store at most 4GB (unsigned int). This issue was observed while doing mkfs.f2fs on a 5GB SD card: https://lkml.org/lkml/2013/4/1/292 Info: sector size = 512 Info: total sectors = 11370496 (in 512bytes) Info: zone aligned segment0 blkaddr: 512 [ 257.789764] blk_update_request: bio idx 0 >= vcnt 0 The mkfs process gets stuck in D state and I see the following in the dmesg: [ 257.789733] __end_that: dev mmcblk0: type=1, flags=122c8081 [ 257.789764] sector 4194304, nr/cnr 2981888/4294959104 [ 257.789764] bio df3840c0, biotail df3848c0, buffer (null), len 1526726656 [ 257.789764] blk_update_request: bio idx 0 >= vcnt 0 [ 257.794921] request botched: dev mmcblk0: type=1, flags=122c8081 [ 257.794921] sector 4194304, nr/cnr 2981888/4294959104 [ 257.794921] bio df3840c0, biotail df3848c0, buffer (null), len 1526726656 This patch fixes this issue. Reported-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Tested-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
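The arithmetic behind the overflow: __data_len is an unsigned int counting bytes, so the largest request it can describe is UINT_MAX = 4294967295 bytes (just under 4 GiB), while the merged request above is 5 GiB = 5368709120 bytes. A sketch of the kind of clamp that prevents it, expressed in 512-byte sectors (a simplified fragment, not necessarily the literal patch):

    /* Cap a single discard so any merged request stays below what an
     * unsigned int byte count (__data_len) can represent. */
    unsigned int max_discard_sectors =
            min(q->limits.max_discard_sectors, UINT_MAX >> 9);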
2013-05-11 | hugetlbfs: fix mmap failure in unaligned size request | Naoya Horiguchi
commit af73e4d9506d3b797509f3c030e7dcd554f7d9c4 upstream. The current kernel returns -EINVAL unless a given mmap length is "almost" hugepage aligned. This is because in sys_mmap_pgoff() the given length is passed to vm_mmap_pgoff() as it is without being aligned with hugepage boundary. This is a regression introduced in commit 40716e29243d ("hugetlbfs: fix alignment of huge page requests"), where alignment code is pushed into hugetlb_file_setup() and the variable len in caller side is not changed. To fix this, this patch partially reverts that commit, and adds alignment code in caller side. And it also introduces hstate_sizelog() in order to get proper hstate to specified hugepage size. Addresses https://bugzilla.kernel.org/show_bug.cgi?id=56881 [akpm@linux-foundation.org: fix warning when CONFIG_HUGETLB_PAGE=n] Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: <iceman_dvd@yahoo.com> Cc: Steven Truelove <steven.truelove@utoronto.ca> Cc: Jianguo Wu <wujianguo@huawei.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
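A sketch of the caller-side alignment described above, for the MAP_HUGETLB path of sys_mmap_pgoff() (simplified; the flag decoding and error handling in the real patch may differ):

    struct hstate *hs;

    /* Pick the hstate matching the requested hugepage size, then round
     * the mapping length up to that hugepage size before mapping. */
    hs = hstate_sizelog((flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK);
    if (!hs)
            return -EINVAL;
    len = ALIGN(len, huge_page_size(hs));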
2013-05-10 | [PATCH v5] Support for dummy vfio iommu driver. | Varun Sethi
This is primarily required for supporting vfio direct device assignment on platforms that don't have a hardware IOMMU. Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com> Change-Id: Id3e968a7c4110fe392354b053112e2dc083adebd Reviewed-on: http://git.am.freescale.net:8181/2401 Reviewed-by: Wood Scott-B07421 <scottwood@freescale.com> Reviewed-by: Yoder Stuart-B08248 <stuart.yoder@freescale.com> Reviewed-by: Bhushan Bharat-R65777 <Bharat.Bhushan@freescale.com> Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com> Tested-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
2013-05-08 | jbd2: fix race between jbd2_journal_remove_checkpoint and ->j_commit_callback | Dmitry Monakhov
commit 794446c6946513c684d448205fbd76fa35f38b72 upstream. The following race is possible: [kjournald2] other_task jbd2_journal_commit_transaction() j_state = T_FINISHED; spin_unlock(&journal->j_list_lock); ->jbd2_journal_remove_checkpoint() ->jbd2_journal_free_transaction(); ->kmem_cache_free(transaction) ->j_commit_callback(journal, transaction); -> USE_AFTER_FREE WARNING: at lib/list_debug.c:62 __list_del_entry+0x1c0/0x250() Hardware name: list_del corruption. prev->next should be ffff88019a4ec198, but was 6b6b6b6b6b6b6b6b Modules linked in: cpufreq_ondemand acpi_cpufreq freq_table mperf coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode sg xhci_hcd button sd_mod crc_t10dif aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf128mul ahci libahci pata_acpi ata_generic dm_mirror dm_region_hash dm_log dm_mod Pid: 16400, comm: jbd2/dm-1-8 Tainted: G W 3.8.0-rc3+ #107 Call Trace: [<ffffffff8106fb0d>] warn_slowpath_common+0xad/0xf0 [<ffffffff8106fc06>] warn_slowpath_fmt+0x46/0x50 [<ffffffff813637e9>] ? ext4_journal_commit_callback+0x99/0xc0 [<ffffffff8148cae0>] __list_del_entry+0x1c0/0x250 [<ffffffff813637bf>] ext4_journal_commit_callback+0x6f/0xc0 [<ffffffff813ca336>] jbd2_journal_commit_transaction+0x23a6/0x2570 [<ffffffff8108aa42>] ? try_to_del_timer_sync+0x82/0xa0 [<ffffffff8108b491>] ? del_timer_sync+0x91/0x1e0 [<ffffffff813d3ecf>] kjournald2+0x19f/0x6a0 [<ffffffff810ad630>] ? wake_up_bit+0x40/0x40 [<ffffffff813d3d30>] ? bit_spin_lock+0x80/0x80 [<ffffffff810ac6be>] kthread+0x10e/0x120 [<ffffffff810ac5b0>] ? __init_kthread_worker+0x70/0x70 [<ffffffff818ff6ac>] ret_from_fork+0x7c/0xb0 [<ffffffff810ac5b0>] ? __init_kthread_worker+0x70/0x70 In order to demonstrace this issue one should mount ext4 with mount -o discard option on SSD disk. This makes callback longer and race window becomes wider. In order to fix this we should mark transaction as finished only after callbacks have completed Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-08 | ext4/jbd2: don't wait (forever) for stale tid caused by wraparound | Theodore Ts'o
commit d76a3a77113db020d9bb1e894822869410450bd9 upstream. In the case where an inode has a very stale transaction id (tid) in i_datasync_tid or i_sync_tid, it's possible that after a very large (2**31) number of transactions, that the tid number space might wrap, causing tid_geq()'s calculations to fail. Commit deeeaf13 "jbd2: fix fsync() tid wraparound bug", later modified by commit e7b04ac0 "jbd2: don't wake kjournald unnecessarily", attempted to fix this problem, but it only avoided kjournald spinning forever by fixing the logic in jbd2_log_start_commit(). Unfortunately, in the codepaths in fs/ext4/fsync.c and fs/ext4/inode.c that might call jbd2_log_start_commit() with a stale tid, those functions will subsequently call jbd2_log_wait_commit() with the same stale tid, and then wait for a very long time. To fix this, we replace the calls to jbd2_log_start_commit() and jbd2_log_wait_commit() with a call to a new function, jbd2_complete_transaction(), which will correctly handle stale tid's. As a bonus, jbd2_complete_transaction() will avoid locking j_state_lock for writing unless a commit needs to be started. This should have a small (but probably not measurable) improvement for ext4's scalability. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reported-by: Ben Hutchings <ben@decadent.org.uk> Reported-by: George Barnett <gbarnett@atlassian.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
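In code terms, the change described above amounts to replacing the two-step pattern in the ext4 callers with the new helper (a sketch of the pattern, not the literal diff):

    /* Before: can wait for a very long time if commit_tid is stale
     * after a tid wrap. */
    jbd2_log_start_commit(journal, commit_tid);
    jbd2_log_wait_commit(journal, commit_tid);

    /* After: one call that copes with stale tids and only takes
     * j_state_lock for writing when a commit actually needs starting. */
    jbd2_complete_transaction(journal, commit_tid);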
2013-05-08 | ipc: sysv shared memory limited to 8TiB | Robin Holt
commit d69f3bad4675ac519d41ca2b11e1c00ca115cecd upstream. Trying to run an application which puts data into half of memory using shmget(), we found that having a shmall value below 8EiB-8TiB would prevent us from using anything more than 8TiB. Setting kernel.shmall greater than 8EiB-8TiB made the job work. The limit comes from the newseg() function, where ns->shm_tot (which at 8TiB is INT_MAX) is checked against ns->shm_ctlall: ipc/shm.c: 458 static int newseg(struct ipc_namespace *ns, struct ipc_params *params) 459 { ... 465 int numpages = (size + PAGE_SIZE -1) >> PAGE_SHIFT; ... 474 if (ns->shm_tot + numpages > ns->shm_ctlall) 475 return -ENOSPC; [akpm@linux-foundation.org: make ipc/shm.c:newseg()'s numpages size_t, not int] Signed-off-by: Robin Holt <holt@sgi.com> Reported-by: Alex Thorlton <athorlton@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-08 | exec: do not abuse ->cred_guard_mutex in threadgroup_lock() | Oleg Nesterov
commit e56fb2874015370e3b7f8d85051f6dce26051df9 upstream. threadgroup_lock() takes signal->cred_guard_mutex to ensure that thread_group_leader() is stable. This doesn't look nice, the scope of this lock in do_execve() is huge. And as Dave pointed out this can lead to deadlock, we have the following dependencies: do_execve: cred_guard_mutex -> i_mutex cgroup_mount: i_mutex -> cgroup_mutex attach_task_by_pid: cgroup_mutex -> cred_guard_mutex Change de_thread() to take threadgroup_change_begin() around the switch-the-leader code and change threadgroup_lock() to avoid ->cred_guard_mutex. Note that de_thread() can't sleep with ->group_rwsem held, this can obviously deadlock with the exiting leader if the writer is active, so it does threadgroup_change_end() before schedule(). Reported-by: Dave Jones <davej@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-08 | cgroup: fix broken file xattrs | Li Zefan
commit 712317ad97f41e738e1a19aa0a6392a78a84094e upstream. We should store file xattrs in struct cfent instead of struct cftype, because cftype is a type while cfent is an object instance of that type. For example, each cgroup has a tasks file, and each tasks file is associated with a unique cfent, but all those files share the same struct cftype. Alexey Kodanev reported a crash, which can be reproduced: # mount -t cgroup -o xattr /sys/fs/cgroup # mkdir /sys/fs/cgroup/test # setfattr -n trusted.value -v test_value /sys/fs/cgroup/tasks # rmdir /sys/fs/cgroup/test # umount /sys/fs/cgroup oops! In this case, simple_xattrs_free() will free the same struct simple_xattrs twice. tj: Dropped unused local variable @cft from cgroup_diput(). Reported-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-08 | ALSA: emu10k1: Fix dock firmware loading | Takashi Iwai
commit e08b34e86dfdb72a62196ce0f03d33f48958d8b9 upstream. The commit [b209c4df: ALSA: emu10k1: cache emu1010 firmware] broke the firmware loading of the dock, just (mistakenly) ignoring a different firmware for docks on some models. This patch revives them again. Bugzilla: https://bugs.archlinux.org/task/34865 Reported-and-tested-by: Tobias Powalowski <tobias.powalowski@googlemail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-08 | mm: allow arch code to control the user page table ceiling | Hugh Dickins
commit 6ee8630e02be6dd89926ca0fbc21af68b23dc087 upstream. On architectures where a pgd entry may be shared between user and kernel (e.g. ARM+LPAE), freeing page tables needs a ceiling other than 0. This patch introduces a generic USER_PGTABLES_CEILING that arch code can override. It is the responsibility of the arch code setting the ceiling to ensure the complete freeing of the page tables (usually in pgd_free()). [catalin.marinas@arm.com: commit log; shift_arg_pages(), asm-generic/pgtables.h changes] Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Russell King <linux@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07 | Merge remote-tracking branch 'fslkvm/for-sdk1.4' into verify | Andy Fleming
2013-05-06 | powerpc/sata: add workaround for erratum A-005636 | Shaohui Xie
When NCQ is enabled, the SATA controller does not support a DMA setup FIS with auto-activate enabled from the device, and the SATA host may time out without finishing the transaction. This has a minor performance impact, as disabling the auto-activate feature requires the device to send a DMA setup as well as a DMA activate FIS to enable reception of the first data FIS. Software sets a flag and checks whether the device enables DMA auto-activate by default; if so, the driver will: 1. Disable the DMA setup auto-activate feature with a SET FEATURES command. 2. If step 1 fails, disable NCQ by setting the queue depth to one. Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com> Change-Id: Icd5b9811eb4cb8342532c5c6238282b6c42cc7fa Reviewed-on: http://git.am.freescale.net:8181/2226 Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com> Tested-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
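A sketch of that workaround sequence, with helper names invented purely for illustration (the real driver code differs):

    /* Erratum A-005636: avoid DMA setup FIS auto-activate under NCQ. */
    if (sata_fsl_dev_uses_dma_auto_activate(dev)) {        /* hypothetical helper */
            /* Step 1: ask the device to turn auto-activate off. */
            if (sata_fsl_set_features_disable_aa(dev) != 0) /* hypothetical helper */
                    /* Step 2: fall back to disabling NCQ entirely by
                     * limiting the queue depth to one. */
                    sata_fsl_limit_queue_depth(dev, 1);     /* hypothetical helper */
    }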
2013-05-02 | kvm: destroy emulated devices on VM exit | Scott Wood
The hassle of getting refcounting right was greater than the hassle of keeping a list of devices to destroy on VM exit. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2013-05-02 | kvm/ppc/mpic: add KVM_CAP_IRQ_MPIC | Scott Wood
Enabling this capability connects the vcpu to the designated in-kernel MPIC. Using explicit connections between vcpus and irqchips allows for flexibility, but the main benefit at the moment is that it simplifies the code -- KVM doesn't need vm-global state to remember which MPIC object is associated with this vm, and it doesn't need to care about ordering between irqchip creation and vcpu creation. Signed-off-by: Scott Wood <scottwood@freescale.com> [agraf: add stub functions for kvmppc_mpic_{dis,}connect_vcpu] Signed-off-by: Alexander Graf <agraf@suse.de>
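A userspace sketch of how such a connection is typically made, assuming the capability arguments are the MPIC device fd and the vcpu number (the argument layout is inferred from the description, not quoted from this patch):

    #include <linux/kvm.h>
    #include <sys/ioctl.h>

    /* Connect one vcpu to an in-kernel MPIC created earlier. */
    static int connect_vcpu_to_mpic(int vcpu_fd, int mpic_dev_fd, int cpu_num)
    {
            struct kvm_enable_cap cap = {
                    .cap  = KVM_CAP_IRQ_MPIC,
                    .args = { mpic_dev_fd, cpu_num },
            };

            return ioctl(vcpu_fd, KVM_ENABLE_CAP, &cap);
    }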
2013-05-02 | kvm/ppc/mpic: in-kernel MPIC emulation | Scott Wood
Hook the MPIC code up to the KVM interfaces, add locking, etc. Signed-off-by: Scott Wood <scottwood@freescale.com> [agraf: add stub function for kvmppc_mpic_set_epr, non-booke, 64bit] Signed-off-by: Alexander Graf <agraf@suse.de>
2013-05-02 | kvm: add device control API | Scott Wood
Currently, devices that are emulated inside KVM are configured in a hardcoded manner based on an assumption that any given architecture only has one way to do it. If there's any need to access device state, it is done through inflexible one-purpose-only IOCTLs (e.g. KVM_GET/SET_LAPIC). Defining new IOCTLs for every little thing is cumbersome and depletes a limited numberspace. This API provides a mechanism to instantiate a device of a certain type, returning an ID that can be used to set/get attributes of the device. Attributes may include configuration parameters (e.g. register base address), device state, operational commands, etc. It is similar to the ONE_REG API, except that it acts on devices rather than vcpus. Both device types and individual attributes can be tested without having to create the device or get/set the attribute, without the need for separately managing enumerated capabilities. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Conflicts: include/linux/kvm_host.h include/uapi/linux/kvm.h Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
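A hedged userspace sketch of the flow this API enables: create a device, then drive it through attributes (the device type, group and attribute values here are placeholders, not taken from this patch):

    #include <linux/kvm.h>
    #include <sys/ioctl.h>

    static int create_and_configure_device(int vm_fd, __u32 dev_type,
                                           __u32 group, __u64 attr_id,
                                           void *val)
    {
            struct kvm_create_device cd = { .type = dev_type };
            struct kvm_device_attr attr;

            /* Instantiate the in-kernel device; on success cd.fd holds a
             * new fd representing this device instance. */
            if (ioctl(vm_fd, KVM_CREATE_DEVICE, &cd) < 0)
                    return -1;

            attr.flags = 0;
            attr.group = group;                     /* device-specific group */
            attr.attr  = attr_id;                   /* attribute within the group */
            attr.addr  = (__u64)(unsigned long)val; /* userspace buffer */

            return ioctl(cd.fd, KVM_SET_DEVICE_ATTR, &attr);
    }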
2013-05-02 | KVM: Move irq routing setup to irqchip.c | Alexander Graf
Setting up IRQ routes is nothing IOAPIC specific. Extract everything that really is generic code into irqchip.c and only leave the ioapic specific bits to irq_comm.c. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Michael S. Tsirkin <mst@redhat.com> Conflicts: virt/kvm/irq_comm.c Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
2013-05-02 | KVM: Extract generic irqchip logic into irqchip.c | Alexander Graf
The current irq_comm.c file contains pieces of code that are generic across different irqchip implementations, as well as code that is fully IOAPIC specific. Split the generic bits out into irqchip.c. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Michael S. Tsirkin <mst@redhat.com> Conflicts: virt/kvm/irq_comm.c Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
2013-05-02 | KVM: Remove kvm_get_intr_delivery_bitmask | Alexander Graf
The prototype has been stale for a while, I can't spot any real function define behind it. Let's just remove it. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Michael S. Tsirkin <mst@redhat.com> Conflicts: include/linux/kvm_host.h Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
2013-05-02 | KVM: Drop __KVM_HAVE_IOAPIC condition on irq routing | Alexander Graf
We have a capability enquire system that allows user space to ask kvm whether a feature is available. The point behind this system is that we can have different kernel configurations with different capabilities and user space can adjust accordingly. Because features can always be non existent, we can drop any #ifdefs on CAP defines that could be used generically, like the irq routing bits. These can be easily reused for non-IOAPIC systems as well. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Michael S. Tsirkin <mst@redhat.com>
2013-05-02 | KVM: Introduce CONFIG_HAVE_KVM_IRQ_ROUTING | Alexander Graf
Quite a bit of code in KVM has been conditionalized on availability of IOAPIC emulation. However, most of it is generically applicable to platforms that don't have an IOPIC, but a different type of irq chip. Make code that only relies on IRQ routing, not an APIC itself, on CONFIG_HAVE_KVM_IRQ_ROUTING, so that we can reuse it later. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Michael S. Tsirkin <mst@redhat.com>
2013-05-02 | KVM: Add KVM_IRQCHIP_NUM_PINS in addition to KVM_IOAPIC_NUM_PINS | Alexander Graf
The concept of routing interrupt lines to an irqchip is nothing that is IOAPIC specific. Every irqchip has a maximum number of pins that can be linked to irq lines. So let's add a new define that allows us to reuse generic code for non-IOAPIC platforms. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Michael S. Tsirkin <mst@redhat.com>
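A sketch of what such a per-irqchip define can look like (the non-IOAPIC value and fallback shape are assumptions for illustration, not the actual header change):

    /* Generic routing code sizes its tables from this, instead of
     * hardcoding the x86 IOAPIC pin count. */
    #ifdef __KVM_HAVE_IOAPIC
    #define KVM_IRQCHIP_NUM_PINS    KVM_IOAPIC_NUM_PINS
    #else
    #define KVM_IRQCHIP_NUM_PINS    256     /* e.g. an in-kernel MPIC */
    #endif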
2013-05-02 | KVM: PPC: revert the MPIC and irqfd patches | Stuart Yoder
There are important bug fixes in the upstream patches that got applied. This revert commit removes the old patches to prepare for the new set. ------------------------------------------------------------------ Revert "KVM: PPC: MPIC: Restrict to e500 platforms" This reverts commit 540983d28a0dbe60b8bd08d96448b6544ad60293. Revert "KVM: PPC: MPIC: Add support for KVM_IRQ_LINE" This reverts commit ce5692ad437dfa0e7eb35918c40cdbd119b50909. Revert "KVM: PPC: Support irq routing and irqfd for in-kernel MPIC" This reverts commit 4ad8621c44d1420090241eaa8d5594e72ae6e05f. Revert "KVM: Move irqfd resample cap handling to generic code" This reverts commit e5557be2f787cff8f4daab4b39d38b5822f41390. Revert "KVM: Move irq routing setup to irqchip.c" This reverts commit 4486cf9a7a4d823c43b54e1b6280f41bcc59022d. Revert "KVM: Extract generic irqchip logic into irqchip.c" This reverts commit 0028971f3b4251cc231989dd9570f426d8472a2b. Revert "KVM: Move irq routing to generic code" This reverts commit e144029a9451b391afcbe123d1895523654e2bc5. Revert "KVM: Remove kvm_get_intr_delivery_bitmask" This reverts commit a7e10a68a247bbcde502947cf2fd4a7722c30512. Revert "KVM: Drop __KVM_HAVE_IOAPIC condition on irq routing" This reverts commit 38deef57a0eef339636c8fb8b8207a5a24e241ad. Revert "KVM: Introduce CONFIG_HAVE_KVM_IRQ_ROUTING" This reverts commit 3b196a30f460bf0f1bd6737266ed479a597a5b58. Revert "KVM: Add KVM_IRQCHIP_NUM_PINS in addition to KVM_IOAPIC_NUM_PINS" This reverts commit 40602a78a8a12d17ca895ea0a2721441c155801f. Revert "kvm/ppc/mpic: add KVM_CAP_IRQ_MPIC" This reverts commit 36e75cd4b65376dbee78ad92b084f9428868fe29. Revert "kvm/ppc/mpic: in-kernel MPIC emulation" This reverts commit 13ded2807a22aceff940ca1e282897e36fb0ba47. Revert "kvm/ppc/mpic: adapt to kernel style and environment" This reverts commit ac811c14ba3229ecbd39f3dc0b7f4c43ded36308. Revert "kvm/ppc/mpic: remove some obviously unneeded code" This reverts commit a60b865ac8d5cf3bff1e4a7dd43b6d8aaed87391. Revert "kvm/ppc/mpic: import hw/openpic.c from QEMU" This reverts commit 256cdf3f6df0561883ce801fd29595ecd031209a. Revert "kvm: add device control API" This reverts commit 8c848b9ed8b15aaccfb54511b22b205afc14f2d6. ------------------------------------------------------------------ Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
2013-05-02 | fsl/usb: Workaround for USB erratum-A005275 | Ramneek Mehresh
Incoming packets are randomly corrupted by h/w, resulting in varying errors. This workaround makes FS the default mode on all affected SoCs by - disabling HS chirp signalling - forcing the EPS field of all packets to FS The erratum does not affect FS mode. The workaround forces all HS devices to connect in FS mode on all SoCs affected by this erratum: P3041 and P2041 rev 1.0 and 1.1 P5020 and P5010 rev 1.0 and 2.0 P5040 and P1010 rev 1.0 The workaround can be disabled by adding "no_erratum_a005275" to the hwconfig string (in the u-boot command line). Signed-off-by: Ramneek Mehresh <ramneek.mehresh@freescale.com> Change-Id: Ie7b75b033220e4be44b5c769d7c187928d84dd6d Reviewed-on: http://git.am.freescale.net:8181/1435 Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com> Tested-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
2013-05-01 | Merge branch 'rtmerge' | Scott Wood
Conflicts: include/linux/preempt.h
2013-05-01 | net: fix incorrect credentials passing | Linus Torvalds
[ Upstream commit 83f1b4ba917db5dc5a061a44b3403ddb6e783494 ] Commit 257b5358b32f ("scm: Capture the full credentials of the scm sender") changed the credentials passing code to pass in the effective uid/gid instead of the real uid/gid. Obviously this doesn't matter most of the time (since normally they are the same), but it results in differences for suid binaries when the wrong uid/gid ends up being used. This just undoes that (presumably unintentional) part of the commit. Reported-by: Andy Lutomirski <luto@amacapital.net> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Serge E. Hallyn <serge@hallyn.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-01 | netfilter: don't reset nf_trace in nf_reset() | Patrick McHardy
[ Upstream commit 124dff01afbdbff251f0385beca84ba1b9adda68 ] Commit 130549fe ("netfilter: reset nf_trace in nf_reset") added code to reset nf_trace in nf_reset(). This is wrong and unnecessary. nf_reset() is used in the following cases: - when passing packets up to the socket layer, at which point we want to release all netfilter references that might keep modules pinned while the packet is queued. nf_trace doesn't matter anymore at this point. - when encapsulating or decapsulating IPsec packets. We want to continue tracing these packets after IPsec processing. - when passing packets through virtual network devices. Only devices that encapsulate in IPv4/v6 matter, since otherwise nf_trace is not used anymore. It's not entirely clear whether those packets should be traced after that, however we've always done that. - when passing packets through virtual network devices that make the packet cross network namespace boundaries. This is the only case where we clearly want to reset nf_trace and is also what the original patch intended to fix. Add a new function nf_reset_trace() and use it in dev_forward_skb() to fix this properly. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
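The helper itself is small; a sketch consistent with the description above (the real definition is likely guarded by the TRACE target config option):

    static inline void nf_reset_trace(struct sk_buff *skb)
    {
    #if IS_ENABLED(CONFIG_NETFILTER_XT_TARGET_TRACE)
            skb->nf_trace = 0;  /* cleared only when crossing netns boundaries */
    #endif
    }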
2013-05-01 | net: count hw_addr syncs so that unsync works properly. | Vlad Yasevich
[ Upstream commit 4543fbefe6e06a9e40d9f2b28d688393a299f079 ] A few drivers use dev_uc_sync/unsync to synchronize the address lists from master down to slave/lower devices. In some cases (bond/team) a single address list is synched down to multiple devices. At the time of unsync, we have a leak in these lower devices, because "synced" is treated as a boolean and the address will not be unsynced for anything after the first device/call. Treat "synced" as a count (same as refcount) and allow all unsync calls to work. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-30 | dpa_stats: Updated Traffic Manager counter support | Aurelian Zanoschi
Updated the API for the Traffic Manager counter according to the implementation from QMan. Signed-off-by: Aurelian Zanoschi <Aurelian.Zanoschi@freescale.com> Change-Id: I3e0985d4dc402ba59754cec762fdd3a9210e938a Reviewed-on: http://git.am.freescale.net:8181/2242 Reviewed-by: Floarea Anca Jeanina-B12569 <anca.floarea@freescale.com> Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com> Tested-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
2013-04-30 | dpa_stats: Add support for NCSP IP reassembly counter | Aurelian Zanoschi
Added support for the non-consistent storage profile (NCSP) reassembly counter on FMANv3-capable platforms. On non-FMANv3 platforms the driver will accept the stat selection but will always return 0. Signed-off-by: Aurelian Zanoschi <Aurelian.Zanoschi@freescale.com> Change-Id: I27501de84499c1db5085510eb7c320709786617e Reviewed-on: http://git.am.freescale.net:8181/2241 Reviewed-by: Floarea Anca Jeanina-B12569 <anca.floarea@freescale.com> Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com> Tested-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
2013-04-30 | dpa_offload: Add DPA Stats component | Jeanina Floarea
The purpose of the DPA Stats module is to provide to the application a unitary method for retrieving counters that are spread at different hardware or software locations. Signed-off-by: Anca Jeanina FLOAREA <anca.floarea@freescale.com> Change-Id: I3b4d886ef5aab00f6de6a330e068b7401bc24b6c Reviewed-on: http://git.am.freescale.net:8181/2237 Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com> Tested-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
2013-04-30 | dpa_offload: Add DPA IPsec component | Andrei Varvara
The DPA IPsec component exports a set of functions used to: - initialize the DPA IPsec module internal data structures - create and configure full inbound IPsec hardware accelerated paths - create and configure full outbound IPsec hardware accelerated paths - replace expired SAs (after rekeying) without packet loss During the initialization phase the DPA IPsec implementation performs a series of actions meant to remove the need for memory allocations and hardware/software object initializations during the runtime phases. INBOUND PATH: SA lookup --> Decryption --> Inbound Policy Verification --> App Rx. The inbound processing of an encrypted packet begins by determining the SA that will be used for decryption and authentication. In accordance with the RFC, the packets are classified based on a 3-tuple that uniquely identifies the SA. This 3-tuple is formed from: - the destination IP address in the IP header of the encrypted packet - the value of the IP protocol field in the IP header of the encrypted packet - the value in the SPI field in the ESP header A special case is when the encrypted packets are encapsulated in a UDP header in order to support NAT traversal. In this case the classification key should contain the following fields: - the destination IP address in the IP header of the encrypted packet - the IP protocol field in the IP header of the encrypted packet - the SPI field in the ESP header - the source UDP port in the UDP header - the destination UDP port in the UDP header This lookup is offloaded to FMAN by means of the classifier API and the FMAN API. When an encrypted packet matches an offloaded key, it is directed by the hardware into the decryption process by enqueuing the packet to a SEC frame queue (FQ). A shared descriptor (representing the decryption SA) is set on this FQ; the SEC performs the decryption and then places the clear-text packet on a FQ that is input for an offline port (OH). Processing continues with inbound policy verification done on the OH port using the FMAN hardware. After this step the packet is enqueued to a FQ created by the application, which benefits from IPsec security. OUTBOUND PATH: App Tx --> Policy Lookup --> Encryption --> Error Checking. The primary function of the policy lookup block is to classify frames and determine the correct SA on which they should be processed. The DPA IPsec can be configured to build a policy key using any subset of the following fields: - masked source IP address - masked destination IP address - optionally masked IP protocol field value - masked source port value / ICMP type field value - masked destination port value / ICMP code field value IP fragmentation can be configured per policy and is performed, if required, on the packets before they are sent to the encryption block. A fragmentation header manipulation identifier has to be passed when offloading the policy. If a clear-text packet hits an offloaded policy, the packet is directed by the FMAN hardware into the proper FQ for SEC processing. After the SEC has completed all the required operations, a new frame is created containing the ESP encapsulated packet. This frame is sent to the next block for further processing, i.e. input to an offline port where error checking is done prior to forwarding the packet to an application-desired FQ based on the SA that processed that packet. Signed-off-by: Andrei Varvara <andrei.varvara@freescale.com> Signed-off-by: Mihai Serb Change-Id: Id8a4afa1cfda42dd2ba1408614a5900cb7b80cee Reviewed-on: http://git.am.freescale.net:8181/2235 Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com> Tested-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
2013-04-30 | dpa_offload: Add packet classification component | Marian Chereji
The packet classification offloading driver implements its functionalities using the NetCommSw FMan driver. It exposes an easy-to-use API which can be called either from kernel space or from user space (through the classification offloading driver wrapper). It is able to create or import (from FMD resources) 3 types of tables - exact match table, indexed table and HASH table. Imported tables can be either empty or prefilled. It offers an API to insert, remove, or modify entries. It allows the users to classify and - enqueue - multicast - discard - re-classify or - return-to-KeyGen network packets. It is able to create or import (from FMD resources) header manipulation operations and attach them to table entries. It allows runtime modification of existing (created or imported) header manipulation operations. It offers an API to create or import (from FMD resources) multicast groups. It allows the user to add or remove members to existing multicast groups. Signed-off-by: Marian Chereji <marian.chereji@freescale.com> Signed-off-by: Radu Bulie <radu.bulie@freescale.com> Change-Id: I854c0c3c2eba6d6f441cb46e502e6dbc623c48d5 Reviewed-on: http://git.am.freescale.net:8181/2233 Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com> Tested-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
2013-04-30 | completion: Use simple wait queues | Thomas Gleixner
Completions have no long-lasting callbacks and therefore do not need the complex waitqueue variant. Use simple waitqueues, which reduces contention on the waitqueue lock. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-04-30 | wait-simple: Rework for use with completions | Thomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-04-30 | sched: Consider pi boosting in setscheduler | Thomas Gleixner
If a PI-boosted task's policy/priority is modified by a setscheduler() call, we unconditionally dequeue and requeue the task if it is on the runqueue, even if the new priority is lower than the current effective boosted priority. This can result in undesired reordering of the priority bucket list. If the new priority is less than or equal to the current effective priority, we just store the new parameters in the task struct and leave the scheduler class and the runqueue untouched. This is handled when the task deboosts itself. Only if the new priority is higher than the effective boosted priority do we apply the change immediately. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Cc: stable-rt@vger.kernel.org
2013-04-30 | wait-simple: Simple waitqueue implementation | Thomas Gleixner
wait_queue is a swiss army knife and in most of the cases the complexity is not needed. For RT waitqueues are a constant source of trouble as we can't convert the head lock to a raw spinlock due to fancy and long lasting callbacks. Provide a slim version, which allows RT to replace wait queues. This should go mainline as well, as it lowers memory consumption and runtime overhead. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-04-30 | net: make devnet_rename_seq a mutex | Sebastian Andrzej Siewior
On RT, write_seqcount_begin() disables preemption, while device_rename() allocates memory with GFP_KERNEL and later grabs the sysfs_mutex. Since I don't see a reason why this can't be a mutex, make it one. We probably don't have that many reads at the same time in the hot path. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2013-04-30 | sched: Add support for lazy preemption | Thomas Gleixner
It has become an obsession to mitigate the determinism vs. throughput loss of RT. Looking at the mainline semantics of preemption points gives a hint why RT sucks throughput-wise for ordinary SCHED_OTHER tasks. One major issue is the wakeup of tasks which right away preempt the waking task while the waking task holds a lock on which the woken task will block right after having preempted the wakee. In mainline this is prevented due to the implicit preemption disable of spin/rw_lock held regions. On RT this is not possible due to the fully preemptible nature of sleeping spinlocks. Though for a SCHED_OTHER task preempting another SCHED_OTHER task this is really not a correctness issue. RT folks are concerned about SCHED_FIFO/RR task preemption and not about the purely fairness-driven SCHED_OTHER preemption latencies. So I introduced a lazy preemption mechanism which only applies to SCHED_OTHER tasks preempting another SCHED_OTHER task. In addition to the existing preempt_count, each task now sports a preempt_lazy_count which is manipulated on lock acquisition and release. This is slightly incorrect, as for laziness reasons I coupled this to migrate_disable/enable so some other mechanisms get the same treatment (e.g. get_cpu_light). Now on the scheduler side, instead of setting NEED_RESCHED this sets NEED_RESCHED_LAZY in case of a SCHED_OTHER/SCHED_OTHER preemption, and therefore allows the waking task to exit the lock-held region before the woken task preempts it. That also works better for cross-CPU wakeups, as the other side can stay in the adaptive spinning loop. For RT class preemption there is no change: it simply sets NEED_RESCHED and forgoes the lazy preemption counter. Initial tests do not expose any observable latency increase, but history shows that I've been proven wrong before :) The lazy preemption mode is on by default, but with CONFIG_SCHED_DEBUG enabled it can be disabled via: # echo NO_PREEMPT_LAZY >/sys/kernel/debug/sched_features and re-enabled via # echo PREEMPT_LAZY >/sys/kernel/debug/sched_features The test results so far are very machine- and workload-dependent, but there is a clear trend that it enhances non-RT workload performance. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-04-30 | net: netfilter: Serialize xt_write_recseq sections on RT | Thomas Gleixner
The netfilter code relies only on the implicit semantics of local_bh_disable() for serializing xt_write_recseq sections. RT breaks that and needs explicit serialization here. Reported-by: Peter LaDow <petela@gocougs.wsu.edu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable-rt@vger.kernel.org
2013-04-30 | mm: Enable SLUB for RT | Thomas Gleixner
Make SLUB RT aware and remove the restriction in Kconfig. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>