- Apr 05, 2011
-
-
Tsutomu Itoh authored
Call btrfs_end_transaction() if btrfs_commit_transaction_async() fails. Signed-off-by:
Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
-
- Mar 28, 2011
-
-
liubo authored
setflags ioctl should return error when any checks fail. Signed-off-by:
Liu Bo <liubo2009@cn.fujitsu.com> Reviewed-by:
David Sterba <dsterba@suse.cz> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
-
Li Dongyang authored
We take a free extent out of the allocator, trim it, then put it back. Before we trim a block group, we should make sure the block group is cached, so this patch also includes a small change that lets cache_block_group() run without a transaction. Signed-off-by:
Li Dongyang <lidongyang@novell.com> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
-
Liu Bo authored
Data compression and data COW are currently controlled across the entire FS by mount options. ioctls are needed to set them on a per-file or per-directory basis. This has been proposed previously, but VFS developers wanted us to use generic ioctls rather than btrfs-specific ones. According to Chris's comment, there should be just one true compression method (probably LZO) stored in the super. However, that has to wait until the method is stable enough to be adopted into the super, so I list it as a long-term goal and just store it in RAM today. After applying this patch, we can use the generic "FS_IOC_SETFLAGS" ioctl to control the datacow and compression attributes of files and directories. NOTE: - The compression type is selected by the following rule: if btrfs is mounted with a compress option (i.e. zlib or lzo), that type is used. Otherwise, the default compress type (zlib today) is used. v1->v2: - Rebase to the latest btrfs. v2->v3: - Fix a problem where a file set to NOCOW via mount option had that setting overridden by inheritance from the parent directory. Signed-off-by:
Liu Bo <liubo2009@cn.fujitsu.com> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
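A minimal userspace sketch of the usage described above, assuming a kernel and headers that provide FS_COMPR_FL and FS_NOCOW_FL in <linux/fs.h>. The helper name update_inode_flag() is hypothetical and is not part of the patch; it reads the current flags first so that unrelated bits are preserved.

```c
#include <errno.h>      /* callers can inspect errno on failure */
#include <sys/ioctl.h>
#include <linux/fs.h>   /* FS_IOC_GETFLAGS, FS_IOC_SETFLAGS, FS_COMPR_FL, FS_NOCOW_FL */

/*
 * Hypothetical helper: set (enable != 0) or clear (enable == 0) one inode
 * flag through the generic flags ioctls.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int update_inode_flag(int fd, int flag, int enable)
{
    int flags = 0;

    /* Read the current flags so that unrelated bits are preserved. */
    if (ioctl(fd, FS_IOC_GETFLAGS, &flags) < 0)
        return -1;

    if (enable)
        flags |= flag;
    else
        flags &= ~flag;

    /* Write the updated flag word back. */
    return ioctl(fd, FS_IOC_SETFLAGS, &flags) < 0 ? -1 : 0;
}
```

A caller would open a file or directory on a btrfs mount and pass FS_COMPR_FL or FS_NOCOW_FL as the flag, for example update_inode_flag(fd, FS_NOCOW_FL, 1).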
-
Tsutomu Itoh authored
This patch changes some BUG_ON() calls into error returns (although most callers still use BUG_ON()). Signed-off-by:
Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
-
- Mar 17, 2011
-
-
Josef Bacik authored
If we cannot truncate an inode for some reason, we will never delete the orphan item associated with that inode, which means that we will loop forever in btrfs_orphan_cleanup. Instead of doing this, just return an error so we fail to mount. It sucks, but hey, it's better than hanging. Thanks, Signed-off-by:
Josef Bacik <josef@redhat.com>
-
- Feb 16, 2011
-
-
Li Zefan authored
- Check user-specified flags correctly - Check the inode ownership - Search for the root item in the root tree rather than the fs tree Reported-by:
Dan Rosenberg <drosenberg@vsecurity.com> Signed-off-by:
Li Zefan <lizf@cn.fujitsu.com> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
-
- Feb 14, 2011
-
-
Dan Rosenberg authored
Commit bf5fc093 refactored btrfs_ioctl_space_info() and introduced several security issues. space_args.space_slots is an unsigned 64-bit type controlled by a possibly unprivileged caller. The comparison as a signed int type allows providing values that are treated as negative and cause the subsequent allocation size calculation to wrap, or be truncated to 0. By providing a size that's truncated to 0, kmalloc() will return ZERO_SIZE_PTR. It's also possible to provide a value smaller than the slot count. The subsequent loop ignores the allocation size when copying data in, resulting in a heap overflow or write to ZERO_SIZE_PTR. The fix changes the slot count type and comparison typecast to u64, which prevents truncation or signedness errors, and also ensures that we don't copy more data than we've allocated in the subsequent loop. Note that zero-size allocations are no longer possible since there is already an explicit check for space_args.space_slots being 0 and truncation of this value is no longer an issue. Signed-off-by:
Dan Rosenberg <drosenberg@vsecurity.com> Signed-off-by:
Josef Bacik <josef@redhat.com> Reviewed-by:
Josef Bacik <josef@redhat.com> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
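The signedness and truncation problem described above can be illustrated with a simplified userspace model. This is only a sketch: the function names are hypothetical, the kernel code is not reproduced, and the narrowing conversion from a 64-bit value to int is implementation-defined in C (common compilers keep the low 32 bits).

```c
#include <stdint.h>

/*
 * Buggy pattern (model): a caller-controlled 64-bit count is compared
 * as a signed int. Large values can become negative or truncate to 0,
 * so the "smaller" value no longer bounds the later copy loop.
 */
static int slots_to_copy_buggy(uint64_t user_slots, int available)
{
    int requested = (int)user_slots;   /* narrowing: may go negative or to 0 */

    return requested < available ? requested : available;
}

/*
 * Fixed pattern (model): keep both operands as unsigned 64-bit values,
 * so the result is never negative and never larger than either input.
 */
static uint64_t slots_to_copy_fixed(uint64_t user_slots, uint64_t available)
{
    return user_slots < available ? user_slots : available;
}
```

With the fixed variant, the result can size the allocation and also bound the copy loop, which is the property the commit message relies on.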
-
- Feb 01, 2011
-
-
Tsutomu Itoh authored
Add the missing error check for btrfs_start_transaction(), and correct the mistaken error checks in several places. Signed-off-by:
Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
-
- Jan 28, 2011
-
-
Tsutomu Itoh authored
btrfs_start_ioctl_transaction() returns ERR_PTR(), not NULL. So, it is necessary to use IS_ERR() to check the return value. Signed-off-by:
Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
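To see why a NULL check misses this kind of failure, here is a small userspace model of the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() convention. The three helpers are re-implemented here for illustration only (the kernel versions live in its own headers), and start_transaction_model() is a hypothetical stand-in, not a btrfs function.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *ERR_PTR(long error)
{
    return (void *)(intptr_t)error;
}

/* Decode the errno value from an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True if the pointer falls in the top MAX_ERRNO addresses. */
static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for a function that reports failure via ERR_PTR(). */
static void *start_transaction_model(int fail)
{
    static int dummy_handle;

    return fail ? ERR_PTR(-ENOMEM) : (void *)&dummy_handle;
}
```

An error pointer is never NULL, so a caller that only tests for NULL treats the failure as success and later dereferences an invalid address. Checking with IS_ERR() and propagating PTR_ERR() avoids that.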
-
Tsutomu Itoh authored
Add the missing error checks for btrfs_join_transaction()/btrfs_join_transaction_nolock(), and correct the mistaken error checks in several places. To make Btrfs more stable, I think we should reduce the use of BUG_ON(), but that will take a long time, so I propose this patch as a short-term solution. With this patch: - The parts that should be corrected to make Btrfs more stable are made clear. - Panics caused by NULL pointer dereferences and the like no longer occur (even if the number of BUG_ON() calls increases temporarily). - An error code is returned wherever an error can easily be returned. As a long-term plan: - BUG_ON() is reduced by using the forced-readonly framework, etc. Signed-off-by:
Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
-
- Jan 26, 2011
-
-
Li Zefan authored
Suppose: - the source extent is: [0, 100] - the src offset is 10 - the clone length is 90 - the dest offset is 0 This statement: new_key.offset = key.offset + destoff - off will produce the following extent key for the dest file: [ino, BTRFS_EXTENT_DATA_KEY, -10], which is obviously wrong. Signed-off-by:
Li Zefan <lizf@cn.fujitsu.com>
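The arithmetic in the example above can be checked with a short sketch. The function names are hypothetical, and the guarded variant is only one plausible way to avoid the wrap, shown as an assumption rather than as the change this commit made.

```c
#include <stdint.h>

/*
 * The statement from the commit message, on unsigned 64-bit offsets.
 * When off > key_offset the subtraction wraps around.
 */
static uint64_t dest_key_offset_naive(uint64_t key_offset, uint64_t destoff,
                                      uint64_t off)
{
    return key_offset + destoff - off;
}

/*
 * One possible guard (assumption): if the clone starts inside the
 * extent, the cloned part begins exactly at destoff in the dest file.
 */
static uint64_t dest_key_offset_guarded(uint64_t key_offset, uint64_t destoff,
                                        uint64_t off)
{
    if (off <= key_offset)
        return key_offset + destoff - off;
    return destoff;
}
```

With key_offset = 0, destoff = 0 and off = 10, the naive form yields the 64-bit representation of -10, while the guarded form yields 0.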
-
- Dec 23, 2010
-
-
Li Zefan authored
This allows us to set a snapshot or a subvolume readonly or writable on the fly. Usage: Set BTRFS_SUBVOL_RDONLY of btrfs_ioctl_vol_arg_v2->flags, and then call ioctl(BTRFS_IOCTL_SUBVOL_SETFLAGS); Changelog for v3: - Change to pass __u64 as ioctl parameter. Changelog for v2: - Add _GETFLAGS ioctl. - Check if the passed fd is the root of a subvolume. - Change the name from _SNAP_SETFLAGS to _SUBVOL_SETFLAGS. Signed-off-by:
Li Zefan <lizf@cn.fujitsu.com>
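A minimal userspace sketch of toggling the read-only flag, assuming current kernel headers, where the ioctls are spelled BTRFS_IOC_SUBVOL_GETFLAGS and BTRFS_IOC_SUBVOL_SETFLAGS and take a __u64 (as the v3 changelog says). The helper name set_subvol_readonly() is hypothetical.

```c
#include <errno.h>        /* callers can inspect errno on failure */
#include <sys/ioctl.h>
#include <linux/types.h>
#include <linux/btrfs.h>  /* BTRFS_IOC_SUBVOL_GETFLAGS/SETFLAGS, BTRFS_SUBVOL_RDONLY */

/*
 * Hypothetical helper: fd must refer to the root of a subvolume or
 * snapshot. Returns 0 on success, -1 with errno set on failure.
 */
static int set_subvol_readonly(int fd, int readonly)
{
    __u64 flags = 0;

    /* Fetch the current flags so other bits are left untouched. */
    if (ioctl(fd, BTRFS_IOC_SUBVOL_GETFLAGS, &flags) < 0)
        return -1;

    if (readonly)
        flags |= BTRFS_SUBVOL_RDONLY;
    else
        flags &= ~BTRFS_SUBVOL_RDONLY;

    return ioctl(fd, BTRFS_IOC_SUBVOL_SETFLAGS, &flags) < 0 ? -1 : 0;
}
```

Reading the flags before writing them relies on the _GETFLAGS ioctl that the v2 changelog mentions.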
-
Li Zefan authored
Usage: Set BTRFS_SUBVOL_RDONLY of btrfs_ioctl_vol_arg_v2->flags, and call ioctl(BTRFS_IOCTL_SNAP_CREATE_V2). Implementation: - Set readonly bit of btrfs_root_item->flags. - Add readonly checks in btrfs_permission (inode_permission), btrfs_setattr, btrfs_set/remove_xattr and some ioctls. Changelog for v3: - Eliminate btrfs_root->readonly, but check btrfs_root->root_item.flags. - Rename BTRFS_ROOT_SNAP_RDONLY to BTRFS_ROOT_SUBVOL_RDONLY. Signed-off-by:
Li Zefan <lizf@cn.fujitsu.com>
-
Li Zefan authored
Split it into two functions for two different ioctls, since they share no common code. Signed-off-by:
Li Zefan <lizf@cn.fujitsu.com>
-
- Dec 22, 2010
-
-
Li Zefan authored
Update the defrag ioctl so that one can choose lzo or zlib when turning on compression in a defrag operation. Changelog: v1 -> v2 - Add incompatibility flag. - Fix to check for an invalid compress type. Signed-off-by:
Li Zefan <lizf@cn.fujitsu.com>
-
Li Zefan authored
Make the code aware of compression type, instead of always assuming zlib compression. Also make the zlib workspace function as common code for all compression types. Signed-off-by:
Li Zefan <lizf@cn.fujitsu.com>
-
- Dec 10, 2010
-
-
Li Zefan authored
If we had reserved some bytes in struct btrfs_ioctl_vol_args, we wouldn't have to create a new structure for async snapshot creation. Here we convert async snapshot ioctl to use a more generic ABI, as we'll add more ioctls for snapshots/subvolumes in the future, readonly snapshots for example. Signed-off-by:
Li Zefan <lizf@cn.fujitsu.com> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
-
Sage Weil authored
We were incorrectly taking the async path even for the sync ioctls by passing in &transid unconditionally. There's ample room for further cleanup here, but this keeps the fix simple. Signed-off-by:
Sage Weil <sage@newdream.net> Reviewed-by:
Li Zefan <lizf@cn.fujitsu.com> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
-
- Nov 22, 2010
-
-
Josef Bacik authored
There are lots of places where we do dentry->d_parent->d_inode without holding the dentry->d_lock. This could cause problems with rename. So instead we need to use dget_parent() and hold the reference to the parent as long as we are going to use its inode and then dput it at the end. Signed-off-by:
Josef Bacik <josef@redhat.com> Cc: raven@themaw.net Signed-off-by:
Chris Mason <chris.mason@oracle.com>
-
Li Zefan authored
Set src_offset = 0, src_length = 20K, dest_offset = 20K. The original size of the dest file 'file2' is 30K:
    # ls -l /mnt/file2
    -rw-r--r-- 1 root root 30720 Nov 18 16:42 /mnt/file2
Now clone file1 to file2. The dest file should be 40K, but it still shows 30K:
    # ls -l /mnt/file2
    -rw-r--r-- 1 root root 30720 Nov 18 16:42 /mnt/file2
Signed-off-by:
Li Zefan <lizf@cn.fujitsu.com> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
-
Li Zefan authored
We've done the check for src_offset and src_length, and we should also check dest_offset; otherwise we'll corrupt the destination file: (After cloning file1 to file2 with unaligned dest_offset) # cat /mnt/file2 cat: /mnt/file2: Input/output error Signed-off-by:
Li Zefan <lizf@cn.fujitsu.com> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
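A sketch of the kind of alignment check described above, with dest_offset included. The function name is hypothetical, the block size is assumed to be a power of two, and any special cases the kernel handles (such as a range ending at end of file) are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true only if the source start, the source end and the
 * destination start all fall on block boundaries.
 * blocksize is assumed to be a power of two.
 */
static bool clone_range_aligned(uint64_t off, uint64_t len, uint64_t destoff,
                                uint64_t blocksize)
{
    uint64_t mask = blocksize - 1;

    return !(off & mask) && !((off + len) & mask) && !(destoff & mask);
}
```

Rejecting an unaligned destoff up front corresponds to the case in the commit message, where the destination file otherwise ends up unreadable.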
-
- Oct 30, 2010
-
-
Sage Weil authored
Add a mount option user_subvol_rm_allowed that allows users to delete a (potentially non-empty!) subvol when they would otherwise be allowed to do an rmdir(2). We duplicate the may_delete() checks from the core VFS code to implement identical security checks (minus the directory size check). We additionally require that the user has write+exec permission on the subvol root inode. Signed-off-by:
Sage Weil <sage@newdream.net> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
-
Sage Weil authored
There is no reason to force an immediate commit when deleting a snapshot. Users have some expectation that space from a deleted snapshot be freed immediately, but even if we do commit the reclaim is a background process. If users _do_ want the deletion to be durable, they can call 'sync'. Signed-off-by:
Sage Weil <sage@newdream.net> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
-
Sage Weil authored
Create a snap without waiting for it to commit to disk. The ioctl is ordered such that subsequent operations will not be contained by the created snapshot, and the commit is initiated, but the ioctl does not wait for the snapshot to commit to disk. We return the specific transid to userspace so that an application can wait for this specific snapshot creation to commit via the WAIT_SYNC ioctl. Signed-off-by:
Sage Weil <sage@newdream.net> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
-
- Oct 29, 2010
-
-
Sage Weil authored
START_SYNC will start a sync/commit, but not wait for it to complete. Any modification started after the ioctl returns is guaranteed not to be included in the commit. If a non-NULL pointer is passed, the transaction id will be returned to userspace. WAIT_SYNC will wait for any in-progress commit to complete. If a transaction id is specified, the ioctl will block and then return (success) when the specified transaction has committed. If it has already committed when we call the ioctl, it returns immediately. If the specified transaction doesn't exist, it returns EINVAL. If no transaction id is specified, WAIT_SYNC will wait for the currently committing transaction to finish its commit to disk. If there is no currently committing transaction, it returns success. These ioctls are useful for applications which want to impose an ordering on when fs modifications reach disk, but do not want to wait for the full (slow) commit process to do so. Picky callers can take the transid returned by START_SYNC and feed it to WAIT_SYNC, and be certain to wait only as long as necessary for the transaction _they_ started to reach disk. Sloppy callers can START_SYNC and WAIT_SYNC without a transid, and provided they didn't wait too long between the calls, they will get the same result. However, if a second commit starts before they call WAIT_SYNC, they may end up waiting longer for it to commit as well. Even so, a START_SYNC+WAIT_SYNC still guarantees that any operation completed before the START_SYNC reaches disk. Signed-off-by:
Sage Weil <sage@newdream.net> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
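The "picky caller" pattern described above might look like this from userspace. This is a sketch under the assumption that <linux/btrfs.h> provides BTRFS_IOC_START_SYNC and BTRFS_IOC_WAIT_SYNC, as current kernel headers do. The helper name start_and_wait_sync() is hypothetical.

```c
#include <errno.h>        /* callers can inspect errno on failure */
#include <sys/ioctl.h>
#include <linux/types.h>
#include <linux/btrfs.h>  /* BTRFS_IOC_START_SYNC, BTRFS_IOC_WAIT_SYNC */

/*
 * Hypothetical helper: start a commit, then wait for exactly that
 * transaction to reach disk. fd is any open file on the filesystem.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int start_and_wait_sync(int fd)
{
    __u64 transid = 0;

    /* Kick off the commit and remember which transaction it is. */
    if (ioctl(fd, BTRFS_IOC_START_SYNC, &transid) < 0)
        return -1;

    /* Other work could happen here before waiting. */

    /* Block until the transaction started above has committed. */
    if (ioctl(fd, BTRFS_IOC_WAIT_SYNC, &transid) < 0)
        return -1;

    return 0;
}
```

Passing the transid back to WAIT_SYNC means the caller waits only for the commit it started, not for a later commit that may have begun in the meantime.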
-
Sage Weil authored
I'm no lockdep expert, but this appears to make the lockdep warning go away for the i_mutex locking in the clone ioctl. Signed-off-by:
Sage Weil <sage@newdream.net> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
-
Sage Weil authored
We had an edge case issue where the requested range was just following an existing extent. Instead of skipping to the next extent, we used the previous one, which led to having zero sized extents. Signed-off-by:
Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
-
Sage Weil authored
The lookup_first_ordered_extent() was done on the wrong inode, and the ->delalloc_bytes test was wrong, as the following btrfs_wait_ordered_range() would only invoke a range write and wouldn't write the entire file data range. Also, a bad parameter was passed to btrfs_wait_ordered_range(). Signed-off-by:
Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
-
Andi Kleen authored
These are all the cases where a variable is set but never read, which are not bugs as far as I can see, but simply leftovers. Still needs more review. Found by gcc 4.6's new warnings. Signed-off-by:
Andi Kleen <ak@linux.intel.com> Cc: Chris Mason <chris.mason@oracle.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
-
Julia Lawall authored
Use memdup_user when user data is immediately copied into the allocated region. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/)
    // <smpl>
    @@
    expression from,to,size,flag;
    position p;
    identifier l1,l2;
    @@

    -  to = \(kmalloc@p\|kzalloc@p\)(size,flag);
    +  to = memdup_user(from,size);
       if (
    -      to==NULL
    +      IS_ERR(to)
                     || ...) {
       <+... when != goto l1;
    -  -ENOMEM
    +  PTR_ERR(to)
       ...+>
       }
    -  if (copy_from_user(to, from, size) != 0) {
    -    <+... when != goto l2;
    -    -EFAULT
    -    ...+>
    -  }
    // </smpl>
Signed-off-by:
Julia Lawall <julia@diku.dk> Cc: Chris Mason <chris.mason@oracle.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
-
- Oct 22, 2010
-
-
Josef Bacik authored
The new ENOSPC stuff broke the df ioctl since we no longer create separate space infos for each RAID type. So instead, loop through each space info's raid lists so we can get the right RAID information, which will allow the df ioctl to tell us RAID types again. Thanks, Signed-off-by:
Josef Bacik <josef@redhat.com>
-
- Jul 19, 2010
-
-
Dan Rosenberg authored
1. The BTRFS_IOC_CLONE and BTRFS_IOC_CLONE_RANGE ioctls should check whether the donor file is append-only before writing to it. 2. The BTRFS_IOC_CLONE_RANGE ioctl appears to have an integer overflow that allows a user to specify an out-of-bounds range to copy from the source file (if off + len wraps around). I haven't been able to successfully exploit this, but I'd imagine that a clever attacker could use this to read things he shouldn't. Even if it's not exploitable, it couldn't hurt to be safe. Signed-off-by:
Dan Rosenberg <dan.j.rosenberg@gmail.com> cc: stable@kernel.org Signed-off-by:
Chris Mason <chris.mason@oracle.com>
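The overflow in item 2 can be guarded against with a wrap check before the bounds check. The following is a simplified sketch with a hypothetical function name, not the kernel code itself.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if [off, off + len) lies inside a source file of
 * src_size bytes. The first test catches off + len wrapping around
 * the 64-bit range, which would otherwise slip past the bounds check.
 */
static bool clone_src_range_valid(uint64_t off, uint64_t len,
                                  uint64_t src_size)
{
    if (off + len < off)
        return false;               /* off + len wrapped around */

    return off + len <= src_size;
}
```

Unsigned overflow is well defined in C, so the comparison off + len < off reliably detects the wrap.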
-
Sage Weil authored
The CLONE and CLONE_RANGE ioctls round up the range of extents being cloned to the block size when the range to clone extends to the end of file (this is always the case with CLONE). It was then using that offset when extending the destination file's i_size. Fix this by not setting i_size beyond the originally requested ending offset. This bug was introduced by a22285a6 (2.6.35-rc1). Signed-off-by:
Sage Weil <sage@newdream.net> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
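The size calculation described above can be sketched as a clamp: the end of the cloned extents may have been rounded up to the block size, but the new file size should not exceed the originally requested end. The function and parameter names are hypothetical.

```c
#include <stdint.h>

/*
 * old_size:   current size of the destination file
 * cloned_end: end offset of the cloned extents (possibly rounded up)
 * destoff:    requested destination offset
 * olen:       originally requested length, before any rounding
 *
 * Returns the size the destination file should have afterwards.
 */
static uint64_t new_dest_size(uint64_t old_size, uint64_t cloned_end,
                              uint64_t destoff, uint64_t olen)
{
    uint64_t endoff = cloned_end;

    /* Never extend past the end offset the caller actually asked for. */
    if (endoff > destoff + olen)
        endoff = destoff + olen;

    /* Cloning only ever grows the file, it does not shrink it. */
    return endoff > old_size ? endoff : old_size;
}
```

For example, cloning 5000 bytes into an empty file with 4096-byte blocks gives a rounded end of 8192, and the clamp brings the resulting size back to 5000.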
-
- Jun 11, 2010
-
-
Dan Carpenter authored
btrfs_lookup_dir_item() can return either ERR_PTRs or null. Signed-off-by:
Dan Carpenter <error27@gmail.com> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
-
Dan Carpenter authored
This was added by a22285a6: "Btrfs: Integrate metadata reservation with start_transaction". If we goto out here then we skip all the unwinding and there are locks still held etc. Signed-off-by:
Dan Carpenter <error27@gmail.com> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
-
- May 25, 2010
-
-
Yan, Zheng authored
Reserve metadata space for handling orphan inodes. Signed-off-by:
Yan Zheng <zheng.yan@oracle.com> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
-
Yan, Zheng authored
Reserve metadata space for extent tree, checksum tree and root tree Signed-off-by:
Yan Zheng <zheng.yan@oracle.com> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
-
Yan, Zheng authored
Introduce metadata reservation context for delayed allocation and update various related functions. This patch also introduces EXTENT_FIRST_DELALLOC control bit for set/clear_extent_bit. It tells set/clear_bit_hook whether they are processing the first extent_state with EXTENT_DELALLOC bit set. This change is important if set/clear_extent_bit involves multiple extent_state. Signed-off-by:
Yan Zheng <zheng.yan@oracle.com> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
-
Yan, Zheng authored
Besides simplifying the code, this change makes sure all metadata reservations for normal metadata operations are released after committing the transaction. Changes since V1: Add code that checks if unlink and rmdir will free space. Add ENOSPC handling for clone ioctl. Signed-off-by:
Yan Zheng <zheng.yan@oracle.com> Signed-off-by:
Chris Mason <chris.mason@oracle.com>
-