Age | Commit message (Collapse) | Author |
|
Signed-off-by: Mandy Lavi <mandy.lavi@freescale.com>
- PFC
Adjustments for PFC configuration constraints and limitations related to
port prefetch mode
- workaround ucode issues
Fix the following HW erratas regarding discard/error frames on V3:
FM_OP_NO_VSP_NO_RELEASE_ERRATA_FMAN_A006675 -
Description: OP without VSP will cause buffer leaks when instructed to discard a frame.
Workaround: FW will release the buffers.
FM_ERROR_VSP_NO_MATCH_SW006 -
Description: Any port with VSP enabled and multiple VSPs are configured
on this port can cause a situation where an error frame will
be enqueued to the error queue not with the default VSP.
Workaround: FW will replaced the current VSP with the default VSP
just before the frame is being enqueued to the error queue.
- Chosen-node new parameter support
errors-to-discard
Usage: optional
Value type: <u32>
Definition: Specifies which errors should be discarded.
Errors that are not in the mask, will not be discarded;
I.e. those errors will be enqueued and sent to the default error queue.
Change-Id: Ib468c67de88376e17d9c39ab5a0c8fc5b33b7b82
Reviewed-on: http://git.am.freescale.net:8181/2605
Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com>
Reviewed-by: Bucur Madalin-Cristian-B32716 <madalin.bucur@freescale.com>
Reviewed-by: Garg Vakul-B16394 <vakul@freescale.com>
Reviewed-by: Radulescu Ruxandra Ioana-B05472 <ruxandra.radulescu@freescale.com>
Reviewed-by: Chereji Marian-Cornel-R27762 <marian.chereji@freescale.com>
Reviewed-by: Wang Haiying-R54964 <Haiying.Wang@freescale.com>
Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
|
|
Patch enhances the performance for forwarded traffic by re-using the
skb in rx ring instead of allocating a new skb. Changes are done
under RX_TX_BUFF_XCHG patch which gets enabled for ASF enabled kernel only.
Patch also removes per-cpu variables used in RX_TX_BUFF_XCHG patch, replacing
with parameters in skbuff structures to enhance the performance.
Change-Id: I2b3c1ec80fe3ef21ade9ce881d5cb86695169518
Signed-off-by: Rajan Gupta <rajan.gupta@freescale.com>
Reviewed-on: http://git.am.freescale.net:8181/2678
Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com>
Reviewed-by: Manoil Claudiu-B08782 <claudiu.manoil@freescale.com>
Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
|
|
This is the 3.8.13 stable release
|
|
In 32-bit builds, the ROUNDING macro was breaking kernel assumptions when
fed with 64-bit parameters. This forces it to use the recommended
wrappers in <linux/math64.h> instead.
Signed-off-by: Geoff Thorpe <Geoff.Thorpe@freescale.com>
Change-Id: Id708cdad58593f38683112adf59984ffd3d763f7
Reviewed-on: http://git.am.freescale.net:8181/2580
Tested-by: Review Code-CDREVIEW <CDREVIEW@freescale.com>
Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
|
|
Added log_err and log_warn based on pr_err and pr_warn for printing
error and warning messages for DPA Offload components.
Signed-off-by: Aurelian Zanoschi <Aurelian.Zanoschi@freescale.com>
Change-Id: I50de023590efabbc73022eb67c694fc4f6ff3d86
Reviewed-on: http://git.am.freescale.net:8181/2314
Reviewed-by: Chereji Marian-Cornel-R27762 <marian.chereji@freescale.com>
Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
Tested-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
|
|
Also properly respect the NONBLOCK flag in the read() API
Make sure IRQ is properly inhibited in the IRQ handler
Create a dependancy between USDPAA and USDPAA_IRQ file pointers
Change-Id: Idc0c33c8448a402d5d127e7e4e22e629dbfa5912
Signed-off-by: Roy Pledge <Roy.Pledge@freescale.com>
Reviewed-on: http://git.am.freescale.net:8181/2310
Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
Tested-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
|
|
commit 62d1f92e06aef9665d71ca7e986b3047ecf0b3c7 upstream.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 18932a28419596bc9403770f5d8a108c5433fe59 upstream.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 219b47339ced80ca580bb6ce7d1636166984afa7 upstream.
Currently we have a problem with this:
1. i915: create gem object
2. i915: export gem object to prime
3. radeon: import gem object
4. close prime fd
5. radeon: unref object
6. i915: unref object
i915 has an imported object reference in its file priv, that isn't
cleaned up properly until fd close. The reference gets added at step 2,
but at step 6 we don't have enough info to clean it up.
The solution is to take a reference on the dma-buf when we export it,
and drop the reference when the gem handle goes away.
So when we export a dma_buf from a gem object, we keep track of it
with the handle, we take a reference to the dma_buf. When we close
the handle (i.e. userspace is finished with the buffer), we drop
the reference to the dma_buf, and it gets collected.
This patch isn't meant to fix any other problem or bikesheds, and it doesn't
fix any races with other scenarios.
v1.1: move export symbol line back up.
v2: okay I had to do a bit more, as the first patch showed a leak
on one of my tests, that I found using the dma-buf debugfs support,
the problem case is exporting a buffer twice with the same handle,
we'd add another export handle for it unnecessarily, however
we now fail if we try to export the same object with a different gem handle,
however I'm not sure if that is a case I want to support, and I've
gotten the code to WARN_ON if we hit something like that.
v2.1: rebase this patch, write better commit msg.
v3: cleanup error handling, track import vs export in linked list,
these two patches were separate previously, but seem to work better
like this.
v4: danvet is correct, this code is no longer useful, since the buffer
better exist, so remove it.
v5: always take a reference to the dma buf object, import or export.
(Imre Deak contributed this originally)
v6: square the circle, remove import vs export tracking now
that there is no difference
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 871dd9286e25330c8a581e5dacfa8b1dfe1dd641 upstream.
linux-v3.8-rc1 and later support for plug for blkdev_issue_discard with
commit 0cfbcafcae8b7364b5fa96c2b26ccde7a3a296a9
(block: add plug for blkdev_issue_discard )
For example,
1) DISCARD rq-1 with size size 4GB
2) DISCARD rq-2 with size size 1GB
If these 2 discard requests get merged, final request size will be 5GB.
In this case, request's __data_len field may overflow as it can store
max 4GB(unsigned int).
This issue was observed while doing mkfs.f2fs on 5GB SD card:
https://lkml.org/lkml/2013/4/1/292
Info: sector size = 512
Info: total sectors = 11370496 (in 512bytes)
Info: zone aligned segment0 blkaddr: 512
[ 257.789764] blk_update_request: bio idx 0 >= vcnt 0
mkfs process gets stuck in D state and I see the following in the dmesg:
[ 257.789733] __end_that: dev mmcblk0: type=1, flags=122c8081
[ 257.789764] sector 4194304, nr/cnr 2981888/4294959104
[ 257.789764] bio df3840c0, biotail df3848c0, buffer (null), len
1526726656
[ 257.789764] blk_update_request: bio idx 0 >= vcnt 0
[ 257.794921] request botched: dev mmcblk0: type=1, flags=122c8081
[ 257.794921] sector 4194304, nr/cnr 2981888/4294959104
[ 257.794921] bio df3840c0, biotail df3848c0, buffer (null), len
1526726656
This patch fixes this issue.
Reported-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Tested-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit af73e4d9506d3b797509f3c030e7dcd554f7d9c4 upstream.
The current kernel returns -EINVAL unless a given mmap length is
"almost" hugepage aligned. This is because in sys_mmap_pgoff() the
given length is passed to vm_mmap_pgoff() as it is without being aligned
with hugepage boundary.
This is a regression introduced in commit 40716e29243d ("hugetlbfs: fix
alignment of huge page requests"), where alignment code is pushed into
hugetlb_file_setup() and the variable len in caller side is not changed.
To fix this, this patch partially reverts that commit, and adds
alignment code in caller side. And it also introduces hstate_sizelog()
in order to get proper hstate to specified hugepage size.
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=56881
[akpm@linux-foundation.org: fix warning when CONFIG_HUGETLB_PAGE=n]
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: <iceman_dvd@yahoo.com>
Cc: Steven Truelove <steven.truelove@utoronto.ca>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This is primarily required for supporting vfio direct device assignment on platforms
that don't have a hardware IOMMU.
Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
Change-Id: Id3e968a7c4110fe392354b053112e2dc083adebd
Reviewed-on: http://git.am.freescale.net:8181/2401
Reviewed-by: Wood Scott-B07421 <scottwood@freescale.com>
Reviewed-by: Yoder Stuart-B08248 <stuart.yoder@freescale.com>
Reviewed-by: Bhushan Bharat-R65777 <Bharat.Bhushan@freescale.com>
Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
Tested-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
|
|
commit 794446c6946513c684d448205fbd76fa35f38b72 upstream.
The following race is possible:
[kjournald2] other_task
jbd2_journal_commit_transaction()
j_state = T_FINISHED;
spin_unlock(&journal->j_list_lock);
->jbd2_journal_remove_checkpoint()
->jbd2_journal_free_transaction();
->kmem_cache_free(transaction)
->j_commit_callback(journal, transaction);
-> USE_AFTER_FREE
WARNING: at lib/list_debug.c:62 __list_del_entry+0x1c0/0x250()
Hardware name:
list_del corruption. prev->next should be ffff88019a4ec198, but was 6b6b6b6b6b6b6b6b
Modules linked in: cpufreq_ondemand acpi_cpufreq freq_table mperf coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode sg xhci_hcd button sd_mod crc_t10dif aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf128mul ahci libahci pata_acpi ata_generic dm_mirror dm_region_hash dm_log dm_mod
Pid: 16400, comm: jbd2/dm-1-8 Tainted: G W 3.8.0-rc3+ #107
Call Trace:
[<ffffffff8106fb0d>] warn_slowpath_common+0xad/0xf0
[<ffffffff8106fc06>] warn_slowpath_fmt+0x46/0x50
[<ffffffff813637e9>] ? ext4_journal_commit_callback+0x99/0xc0
[<ffffffff8148cae0>] __list_del_entry+0x1c0/0x250
[<ffffffff813637bf>] ext4_journal_commit_callback+0x6f/0xc0
[<ffffffff813ca336>] jbd2_journal_commit_transaction+0x23a6/0x2570
[<ffffffff8108aa42>] ? try_to_del_timer_sync+0x82/0xa0
[<ffffffff8108b491>] ? del_timer_sync+0x91/0x1e0
[<ffffffff813d3ecf>] kjournald2+0x19f/0x6a0
[<ffffffff810ad630>] ? wake_up_bit+0x40/0x40
[<ffffffff813d3d30>] ? bit_spin_lock+0x80/0x80
[<ffffffff810ac6be>] kthread+0x10e/0x120
[<ffffffff810ac5b0>] ? __init_kthread_worker+0x70/0x70
[<ffffffff818ff6ac>] ret_from_fork+0x7c/0xb0
[<ffffffff810ac5b0>] ? __init_kthread_worker+0x70/0x70
In order to demonstrace this issue one should mount ext4 with mount -o
discard option on SSD disk. This makes callback longer and race
window becomes wider.
In order to fix this we should mark transaction as finished only after
callbacks have completed
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d76a3a77113db020d9bb1e894822869410450bd9 upstream.
In the case where an inode has a very stale transaction id (tid) in
i_datasync_tid or i_sync_tid, it's possible that after a very large
(2**31) number of transactions, that the tid number space might wrap,
causing tid_geq()'s calculations to fail.
Commit deeeaf13 "jbd2: fix fsync() tid wraparound bug", later modified
by commit e7b04ac0 "jbd2: don't wake kjournald unnecessarily",
attempted to fix this problem, but it only avoided kjournald spinning
forever by fixing the logic in jbd2_log_start_commit().
Unfortunately, in the codepaths in fs/ext4/fsync.c and fs/ext4/inode.c
that might call jbd2_log_start_commit() with a stale tid, those
functions will subsequently call jbd2_log_wait_commit() with the same
stale tid, and then wait for a very long time. To fix this, we
replace the calls to jbd2_log_start_commit() and
jbd2_log_wait_commit() with a call to a new function,
jbd2_complete_transaction(), which will correctly handle stale tid's.
As a bonus, jbd2_complete_transaction() will avoid locking
j_state_lock for writing unless a commit needs to be started. This
should have a small (but probably not measurable) improvement for
ext4's scalability.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Reported-by: George Barnett <gbarnett@atlassian.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d69f3bad4675ac519d41ca2b11e1c00ca115cecd upstream.
Trying to run an application which was trying to put data into half of
memory using shmget(), we found that having a shmall value below 8EiB-8TiB
would prevent us from using anything more than 8TiB. By setting
kernel.shmall greater than 8EiB-8TiB would make the job work.
In the newseg() function, ns->shm_tot which, at 8TiB is INT_MAX.
ipc/shm.c:
458 static int newseg(struct ipc_namespace *ns, struct ipc_params *params)
459 {
...
465 int numpages = (size + PAGE_SIZE -1) >> PAGE_SHIFT;
...
474 if (ns->shm_tot + numpages > ns->shm_ctlall)
475 return -ENOSPC;
[akpm@linux-foundation.org: make ipc/shm.c:newseg()'s numpages size_t, not int]
Signed-off-by: Robin Holt <holt@sgi.com>
Reported-by: Alex Thorlton <athorlton@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e56fb2874015370e3b7f8d85051f6dce26051df9 upstream.
threadgroup_lock() takes signal->cred_guard_mutex to ensure that
thread_group_leader() is stable. This doesn't look nice, the scope of
this lock in do_execve() is huge.
And as Dave pointed out this can lead to deadlock, we have the
following dependencies:
do_execve: cred_guard_mutex -> i_mutex
cgroup_mount: i_mutex -> cgroup_mutex
attach_task_by_pid: cgroup_mutex -> cred_guard_mutex
Change de_thread() to take threadgroup_change_begin() around the
switch-the-leader code and change threadgroup_lock() to avoid
->cred_guard_mutex.
Note that de_thread() can't sleep with ->group_rwsem held, this can
obviously deadlock with the exiting leader if the writer is active, so it
does threadgroup_change_end() before schedule().
Reported-by: Dave Jones <davej@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 712317ad97f41e738e1a19aa0a6392a78a84094e upstream.
We should store file xattrs in struct cfent instead of struct cftype,
because cftype is a type while cfent is object instance of cftype.
For example each cgroup has a tasks file, and each tasks file is
associated with a uniq cfent, but all those files share the same
struct cftype.
Alexey Kodanev reported a crash, which can be reproduced:
# mount -t cgroup -o xattr /sys/fs/cgroup
# mkdir /sys/fs/cgroup/test
# setfattr -n trusted.value -v test_value /sys/fs/cgroup/tasks
# rmdir /sys/fs/cgroup/test
# umount /sys/fs/cgroup
oops!
In this case, simple_xattrs_free() will free the same struct simple_xattrs
twice.
tj: Dropped unused local variable @cft from cgroup_diput().
Reported-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e08b34e86dfdb72a62196ce0f03d33f48958d8b9 upstream.
The commit [b209c4df: ALSA: emu10k1: cache emu1010 firmware] broke the
firmware loading of the dock, just (mistakenly) ignoring a different
firmware for docks on some models. This patch revives them again.
Bugzilla: https://bugs.archlinux.org/task/34865
Reported-and-tested-by: Tobias Powalowski <tobias.powalowski@googlemail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6ee8630e02be6dd89926ca0fbc21af68b23dc087 upstream.
On architectures where a pgd entry may be shared between user and kernel
(e.g. ARM+LPAE), freeing page tables needs a ceiling other than 0.
This patch introduces a generic USER_PGTABLES_CEILING that arch code can
override. It is the responsibility of the arch code setting the ceiling
to ensure the complete freeing of the page tables (usually in
pgd_free()).
[catalin.marinas@arm.com: commit log; shift_arg_pages(), asm-generic/pgtables.h changes]
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
When NCQ is enabled, the SATA controller does not support
DMA setup FIS with autoactivate enabled from the device. The SATA host
may timeout without finishing the transaction.
This will have a minor performance impact as disabling the
auto-activate feature requires the device to send a DMA setup as
well as a DMA activate FIS to enable reception of the first data
FIS.
Software will set a flag and check if device enables DMA AA by default,
then driver will:
1. Disable the DMA setup auto-activate feature by a set features command.
2. if 1# fail, then disable NCQ by setting the queue depth to one.
Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com>
Change-Id: Icd5b9811eb4cb8342532c5c6238282b6c42cc7fa
Reviewed-on: http://git.am.freescale.net:8181/2226
Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
Tested-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
|
|
The hassle of getting refcounting right was greater than the hassle
of keeping a list of devices to destroy on VM exit.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
|
|
Enabling this capability connects the vcpu to the designated in-kernel
MPIC. Using explicit connections between vcpus and irqchips allows
for flexibility, but the main benefit at the moment is that it
simplifies the code -- KVM doesn't need vm-global state to remember
which MPIC object is associated with this vm, and it doesn't need to
care about ordering between irqchip creation and vcpu creation.
Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: add stub functions for kvmppc_mpic_{dis,}connect_vcpu]
Signed-off-by: Alexander Graf <agraf@suse.de>
|
|
Hook the MPIC code up to the KVM interfaces, add locking, etc.
Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: add stub function for kvmppc_mpic_set_epr, non-booke, 64bit]
Signed-off-by: Alexander Graf <agraf@suse.de>
|
|
Currently, devices that are emulated inside KVM are configured in a
hardcoded manner based on an assumption that any given architecture
only has one way to do it. If there's any need to access device state,
it is done through inflexible one-purpose-only IOCTLs (e.g.
KVM_GET/SET_LAPIC). Defining new IOCTLs for every little thing is
cumbersome and depletes a limited numberspace.
This API provides a mechanism to instantiate a device of a certain
type, returning an ID that can be used to set/get attributes of the
device. Attributes may include configuration parameters (e.g.
register base address), device state, operational commands, etc. It
is similar to the ONE_REG API, except that it acts on devices rather
than vcpus.
Both device types and individual attributes can be tested without having
to create the device or get/set the attribute, without the need for
separately managing enumerated capabilities.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Conflicts:
include/linux/kvm_host.h
include/uapi/linux/kvm.h
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
|
|
Setting up IRQ routes is nothing IOAPIC specific. Extract everything
that really is generic code into irqchip.c and only leave the ioapic
specific bits to irq_comm.c.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Conflicts:
virt/kvm/irq_comm.c
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
|
|
The current irq_comm.c file contains pieces of code that are generic
across different irqchip implementations, as well as code that is
fully IOAPIC specific.
Split the generic bits out into irqchip.c.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Conflicts:
virt/kvm/irq_comm.c
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
|
|
The prototype has been stale for a while, I can't spot any real function
define behind it. Let's just remove it.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Conflicts:
include/linux/kvm_host.h
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
|
|
We have a capability enquire system that allows user space to ask kvm
whether a feature is available.
The point behind this system is that we can have different kernel
configurations with different capabilities and user space can adjust
accordingly.
Because features can always be non existent, we can drop any #ifdefs
on CAP defines that could be used generically, like the irq routing
bits. These can be easily reused for non-IOAPIC systems as well.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Quite a bit of code in KVM has been conditionalized on availability of
IOAPIC emulation. However, most of it is generically applicable to
platforms that don't have an IOPIC, but a different type of irq chip.
Make code that only relies on IRQ routing, not an APIC itself, on
CONFIG_HAVE_KVM_IRQ_ROUTING, so that we can reuse it later.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
|
|
The concept of routing interrupt lines to an irqchip is nothing
that is IOAPIC specific. Every irqchip has a maximum number of pins
that can be linked to irq lines.
So let's add a new define that allows us to reuse generic code for
non-IOAPIC platforms.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
|
|
There are important bug fixes in the upstream patches
that got applied. This revert commit removes the old patches to
prepare for the new set.
------------------------------------------------------------------
Revert "KVM: PPC: MPIC: Restrict to e500 platforms"
This reverts commit 540983d28a0dbe60b8bd08d96448b6544ad60293.
Revert "KVM: PPC: MPIC: Add support for KVM_IRQ_LINE"
This reverts commit ce5692ad437dfa0e7eb35918c40cdbd119b50909.
Revert "KVM: PPC: Support irq routing and irqfd for in-kernel MPIC"
This reverts commit 4ad8621c44d1420090241eaa8d5594e72ae6e05f.
Revert "KVM: Move irqfd resample cap handling to generic code"
This reverts commit e5557be2f787cff8f4daab4b39d38b5822f41390.
Revert "KVM: Move irq routing setup to irqchip.c"
This reverts commit 4486cf9a7a4d823c43b54e1b6280f41bcc59022d.
Revert "KVM: Extract generic irqchip logic into irqchip.c"
This reverts commit 0028971f3b4251cc231989dd9570f426d8472a2b.
Revert "KVM: Move irq routing to generic code"
This reverts commit e144029a9451b391afcbe123d1895523654e2bc5.
Revert "KVM: Remove kvm_get_intr_delivery_bitmask"
This reverts commit a7e10a68a247bbcde502947cf2fd4a7722c30512.
Revert "KVM: Drop __KVM_HAVE_IOAPIC condition on irq routing"
This reverts commit 38deef57a0eef339636c8fb8b8207a5a24e241ad.
Revert "KVM: Introduce CONFIG_HAVE_KVM_IRQ_ROUTING"
This reverts commit 3b196a30f460bf0f1bd6737266ed479a597a5b58.
Revert "KVM: Add KVM_IRQCHIP_NUM_PINS in addition to KVM_IOAPIC_NUM_PINS"
This reverts commit 40602a78a8a12d17ca895ea0a2721441c155801f.
Revert "kvm/ppc/mpic: add KVM_CAP_IRQ_MPIC"
This reverts commit 36e75cd4b65376dbee78ad92b084f9428868fe29.
Revert "kvm/ppc/mpic: in-kernel MPIC emulation"
This reverts commit 13ded2807a22aceff940ca1e282897e36fb0ba47.
Revert "kvm/ppc/mpic: adapt to kernel style and environment"
This reverts commit ac811c14ba3229ecbd39f3dc0b7f4c43ded36308.
Revert "kvm/ppc/mpic: remove some obviously unneeded code"
This reverts commit a60b865ac8d5cf3bff1e4a7dd43b6d8aaed87391.
Revert "kvm/ppc/mpic: import hw/openpic.c from QEMU"
This reverts commit 256cdf3f6df0561883ce801fd29595ecd031209a.
Revert "kvm: add device control API"
This reverts commit 8c848b9ed8b15aaccfb54511b22b205afc14f2d6.
------------------------------------------------------------------
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
|
|
Incoming packets are randomly corrucpted by h/w resulting
in varying errors. This workaround makes FS as default mode
in all affected socs by
- Disabling HS chirp signalling
- Forcing EPS field of all packets to FS
This errata does not affect FS mode.
Forces all HS devices to connect in FS mode for all socs
affected by this erratum:
P3041 and P2041 rev 1.0 and 1.1
P5020 and P5010 rev 1.0 and 2.0
P5040 and P1010 rev 1.0
Workaround can be disabled by mentioning "no_erratum_a005275"
in hwconfig string (in u-boot command line)
Signed-off-by: Ramneek Mehresh <ramneek.mehresh@freescale.com>
Change-Id: Ie7b75b033220e4be44b5c769d7c187928d84dd6d
Reviewed-on: http://git.am.freescale.net:8181/1435
Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
Tested-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
|
|
Conflicts:
include/linux/preempt.h
|
|
[ Upstream commit 83f1b4ba917db5dc5a061a44b3403ddb6e783494 ]
Commit 257b5358b32f ("scm: Capture the full credentials of the scm
sender") changed the credentials passing code to pass in the effective
uid/gid instead of the real uid/gid.
Obviously this doesn't matter most of the time (since normally they are
the same), but it results in differences for suid binaries when the wrong
uid/gid ends up being used.
This just undoes that (presumably unintentional) part of the commit.
Reported-by: Andy Lutomirski <luto@amacapital.net>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Serge E. Hallyn <serge@hallyn.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 124dff01afbdbff251f0385beca84ba1b9adda68 ]
Commit 130549fe ("netfilter: reset nf_trace in nf_reset") added code
to reset nf_trace in nf_reset(). This is wrong and unnecessary.
nf_reset() is used in the following cases:
- when passing packets up the the socket layer, at which point we want to
release all netfilter references that might keep modules pinned while
the packet is queued. nf_trace doesn't matter anymore at this point.
- when encapsulating or decapsulating IPsec packets. We want to continue
tracing these packets after IPsec processing.
- when passing packets through virtual network devices. Only devices on
that encapsulate in IPv4/v6 matter since otherwise nf_trace is not
used anymore. Its not entirely clear whether those packets should
be traced after that, however we've always done that.
- when passing packets through virtual network devices that make the
packet cross network namespace boundaries. This is the only cases
where we clearly want to reset nf_trace and is also what the
original patch intended to fix.
Add a new function nf_reset_trace() and use it in dev_forward_skb() to
fix this properly.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 4543fbefe6e06a9e40d9f2b28d688393a299f079 ]
A few drivers use dev_uc_sync/unsync to synchronize the
address lists from master down to slave/lower devices. In
some cases (bond/team) a single address list is synched down
to multiple devices. At the time of unsync, we have a leak
in these lower devices, because "synced" is treated as a
boolean and the address will not be unsynced for anything after
the first device/call.
Treat "synced" as a count (same as refcount) and allow all
unsync calls to work.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Updated the API for the Traffic Manager counter according to the
implementation from QMan.
Signed-off-by: Aurelian Zanoschi <Aurelian.Zanoschi@freescale.com>
Change-Id: I3e0985d4dc402ba59754cec762fdd3a9210e938a
Reviewed-on: http://git.am.freescale.net:8181/2242
Reviewed-by: Floarea Anca Jeanina-B12569 <anca.floarea@freescale.com>
Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
Tested-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
|
|
Added support for non consistent storage profile reassembly counter
for FMANv3 capable platforms. For non FMANv3 platforms the driver will
accept the stat selection, but will always return 0
Signed-off-by: Aurelian Zanoschi <Aurelian.Zanoschi@freescale.com>
Change-Id: I27501de84499c1db5085510eb7c320709786617e
Reviewed-on: http://git.am.freescale.net:8181/2241
Reviewed-by: Floarea Anca Jeanina-B12569 <anca.floarea@freescale.com>
Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
Tested-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
|
|
The purpose of the DPA Stats module is to provide to the application
a unitary method for retrieving counters that are spread at different
hardware or software locations.
Signed-off-by: Anca Jeanina FLOAREA <anca.floarea@freescale.com>
Change-Id: I3b4d886ef5aab00f6de6a330e068b7401bc24b6c
Reviewed-on: http://git.am.freescale.net:8181/2237
Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
Tested-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
|
|
The DPA IPSec component exports a set of functions used to:
- initialize the DPA IPSec module internal data structures
- create and configure full inbound IPSec hardware accelerated paths
- create and configure full outbound IPSec hardware accelerated paths
- replace expired SAs (after rekeying) without packet loss
During the initialization phase the DPA IPSec implementation
performs a series of actions meant to remove the need to perform
memory allocations and hardware/software object initializations
during the runtime phases.
INBOUND PATH:
_____________ ______________ _______________________________ __________
|| SA lookup ||--> || Decryption || --> || Inbound Policy Verification || --> || App Rx ||
||___________|| ||____________|| ||_____________________________|| ||________||
The inbound processing of an encrypted packet begins by determining the SA
that will be used for decryption and authentication. In accordance to the
RFC the packets are classified based on a 3-tuple that uniquely identifies
the SA. This 3-tuple is formed from:
- the destination IP address in the IP header of the encrypted packet
- the value of the IP protocol field in the IP header of the encrypted packet
- the value in the SPI field in the ESP header
A special case is that were the encrypted packets are encapsulated in an
UDP header in order to support NAT traversal. In this case the
classification key should contain the following fields:
- the destination IP address in the IP header of the encrypted packet
- the IP protocol field in the IP header of the encrypted packet
- the SPI field in the ESP header
- the source UDP port in the UDP header
- the destination UDP port in the UDP header
This lookup is offloaded to FMAN by means of classifier API and FMAN API.
When an encrypted packet matches an offloaded key, it is directed by the
hardware into the decryption process by enquing the packet to a SEC frame
queue (FQ). A shared descriptor (representing the decryption SA) is set on
this FQ and the SEC will begin the decryption process and then place the
clear text packet to a FQ that is input for an offline port (OH). Processing
continues with inbound policy verification done on OH using the FMAN hardware.
After this step the packet is enqueued to a FQ created by the application
which benefits of IPSec security
OUTBOUND PATH:
__________ _________________ ______________ __________________
|| App Tx || --> || Policy Lookup || --> || Encryption || --> || Error Checking ||
||________|| ||_______________|| ||____________|| ||________________||
The primary function of the policy lookup block is to classify frames and
determine the correct SA on which they should be processed.
The DPA IPSec can be configured to build a policy key using any subset of the following fields:
- masked source IP address
- masked destination IP address
- optionally masked IP protocol field value
- masked source port value / ICMP type field value
- masked destination port value / ICMP code field value
IP fragmentation can be configured per policy and is performed, if required,
on the packets before being sent to the Encryption Block. A fragmentation header
manipulation identifier has to be passed when offloading the policy. If a
clear text packet hits an offloaded policy the packet will be directed by FMAN
hardware into the proper FQ for SEC processing.
After the SEC has completed all the required operations, a new frame is created
containing the ESP encapsulated packet. This frame will be sent to
the next block for further processing i.e input to offline port where error checking
is done prior to forwarding the packet to an application desired FQ based on the
SA that processed that packet.
Signed-off-by: Andrei Varvara <andrei.varvara@freescale.com>
Signed-off-by: Mihai Serb
Change-Id: Id8a4afa1cfda42dd2ba1408614a5900cb7b80cee
Reviewed-on: http://git.am.freescale.net:8181/2235
Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
Tested-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
|
|
The packet classification offloading driver implements its functionalities
using the NetCommSw FMan driver. It exposes an easy-to-use API which can be
called either from kernel space or from user space (through the
classification offloading driver wrapper).
It is able to create or import (from FMD resources) 3 types of tables -
exact match table, indexed table and HASH table. Imported tables can be
either empty or prefilled.
It offers an API to insert, remove, or modify entries. It allows the users
to classify and
- enqueue
- multicast
- discard
- re-classify or
- return-to-KeyGen
network packets.
It is able to create or import (from FMD resources) header manipulation
operations and attach them to table entries. It allows runtime modification
of existing (created or imported) header manipulation operations.
It offers an API to create or import (from FMD resources) multicast groups.
It allows the user to add or remove members to existing multicast groups.
Signed-off-by: Marian Chereji <marian.chereji@freescale.com>
Signed-off-by: Radu Bulie <radu.bulie@freescale.com>
Change-Id: I854c0c3c2eba6d6f441cb46e502e6dbc623c48d5
Reviewed-on: http://git.am.freescale.net:8181/2233
Reviewed-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
Tested-by: Fleming Andrew-AFLEMING <AFLEMING@freescale.com>
|
|
Completions have no long lasting callbacks and therefor do not need
the complex waitqueue variant. Use simple waitqueues which reduces the
contention on the waitqueue lock.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
If a PI boosted task policy/priority is modified by a setscheduler()
call we unconditionally dequeue and requeue the task if it is on the
runqueue even if the new priority is lower than the current effective
boosted priority. This can result in undesired reordering of the
priority bucket list.
If the new priority is less or equal than the current effective we
just store the new parameters in the task struct and leave the
scheduler class and the runqueue untouched. This is handled when the
task deboosts itself. Only if the new priority is higher than the
effective boosted priority we apply the change immediately.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Cc: stable-rt@vger.kernel.org
|
|
wait_queue is a swiss army knife and in most of the cases the
complexity is not needed. For RT waitqueues are a constant source of
trouble as we can't convert the head lock to a raw spinlock due to
fancy and long lasting callbacks.
Provide a slim version, which allows RT to replace wait queues. This
should go mainline as well, as it lowers memory consumption and
runtime overhead.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
On RT write_seqcount_begin() disables preemption and device_rename()
allocates memory with GFP_KERNEL and grabs later the sysfs_mutex mutex.
Since I don't see a reason why this can't be a mutex, make it one. We
probably don't have that much reads at the same time in the hot path.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
It has become an obsession to mitigate the determinism vs. throughput
loss of RT. Looking at the mainline semantics of preemption points
gives a hint why RT sucks throughput wise for ordinary SCHED_OTHER
tasks. One major issue is the wakeup of tasks which are right away
preempting the waking task while the waking task holds a lock on which
the woken task will block right after having preempted the wakee. In
mainline this is prevented due to the implicit preemption disable of
spin/rw_lock held regions. On RT this is not possible due to the fully
preemptible nature of sleeping spinlocks.
Though for a SCHED_OTHER task preempting another SCHED_OTHER task this
is really not a correctness issue. RT folks are concerned about
SCHED_FIFO/RR tasks preemption and not about the purely fairness
driven SCHED_OTHER preemption latencies.
So I introduced a lazy preemption mechanism which only applies to
SCHED_OTHER tasks preempting another SCHED_OTHER task. Aside of the
existing preempt_count each tasks sports now a preempt_lazy_count
which is manipulated on lock acquiry and release. This is slightly
incorrect as for lazyness reasons I coupled this on
migrate_disable/enable so some other mechanisms get the same treatment
(e.g. get_cpu_light).
Now on the scheduler side instead of setting NEED_RESCHED this sets
NEED_RESCHED_LAZY in case of a SCHED_OTHER/SCHED_OTHER preemption and
therefor allows to exit the waking task the lock held region before
the woken task preempts. That also works better for cross CPU wakeups
as the other side can stay in the adaptive spinning loop.
For RT class preemption there is no change. This simply sets
NEED_RESCHED and forgoes the lazy preemption counter.
Initial test do not expose any observable latency increasement, but
history shows that I've been proven wrong before :)
The lazy preemption mode is per default on, but with
CONFIG_SCHED_DEBUG enabled it can be disabled via:
# echo NO_PREEMPT_LAZY >/sys/kernel/debug/sched_features
and reenabled via
# echo PREEMPT_LAZY >/sys/kernel/debug/sched_features
The test results so far are very machine and workload dependent, but
there is a clear trend that it enhances the non RT workload
performance.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The netfilter code relies only on the implicit semantics of
local_bh_disable() for serializing wt_write_recseq sections. RT breaks
that and needs explicit serialization here.
Reported-by: Peter LaDow <petela@gocougs.wsu.edu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable-rt@vger.kernel.org
|
|
Make SLUB RT aware and remove the restriction in Kconfig.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|